Visualization

Learn how to visualize data frames.
Author: 이광춘
Affiliation: TCS
Published: January 16, 2023

1 Grammar of Graphics

2 Visualization Packages

  • Static graphics
    • plotnine
    • matplotlib
      • NumPy-friendly graphics package
    • seaborn
      • Graphics package built on matplotlib and friendly to pandas data structures
  • Interactive graphics
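
To make the distinction in the list above concrete, here is a minimal sketch of how the two static packages are typically called. The tiny `toy` data frame is made up purely for illustration and is not part of the gapminder data. matplotlib receives the values as arrays, while seaborn receives the data frame plus column names and draws onto an ordinary matplotlib Axes.

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Hypothetical toy data, for illustration only
toy = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0], "y": [2.0, 4.1, 5.9, 8.2]})

fig, (ax_mpl, ax_sns) = plt.subplots(1, 2, figsize=(8, 3))

# matplotlib: pass array-like values directly; axis labels are left empty
ax_mpl.scatter(toy["x"].to_numpy(), toy["y"].to_numpy())
ax_mpl.set_title("matplotlib")

# seaborn: pass the DataFrame and refer to columns by name;
# it labels the axes from the column names and returns the matplotlib Axes
returned_ax = sns.scatterplot(data=toy, x="x", y="y", ax=ax_sns)
ax_sns.set_title("seaborn")
```

Because seaborn hands back a matplotlib Axes, anything matplotlib offers (titles, limits, `plt.show()`) can still be applied to a seaborn plot afterwards.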

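3 Hello World

Let's draw the same plot with several different packages. Install the seaborn and plotnine packages beforehand. The Python examples also import the gapminder package, so it must be available as well.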

$ pip3 install seaborn
$ pip3 install plotnine
library(tidyverse)
library(gapminder)

ggplot(data = gapminder, aes(x = lifeExp, y = gdpPercap)) +
  geom_point()

from plotnine import *
from gapminder import gapminder

( ggplot(gapminder)
   + geom_point(aes('lifeExp', 'gdpPercap'))
)
#> <ggplot: (161573666082)>

import matplotlib.pyplot as plt
from gapminder import gapminder

# Define the scatter plot
plt.scatter(gapminder['lifeExp'], gapminder['gdpPercap'])

# Add axis labels
plt.xlabel('Life Expectancy')
plt.ylabel('GDP per Capita')

# Show the plot
plt.show()

import matplotlib.pyplot as plt
import seaborn as sns
from gapminder import gapminder

# Define the scatter plot
sns.scatterplot(x='lifeExp', y='gdpPercap', data=gapminder)

# Add axis labels
plt.xlabel('Life Expectancy')
plt.ylabel('GDP per Capita')

# Show the plot
plt.show()

4 Grammar of Graphics

4.1 Visualization Exercise 1

library(tidyverse)
library(gapminder)

gapminder %>% 
  ggplot(aes(x=year, y=lifeExp, group = country, color=continent)) +
    geom_line() +
    labs(x = "Year", y = "Life expectancy",
         title = "Trends in average life expectancy by country")

import matplotlib.pyplot as plt
import plotnine as p9
from gapminder import gapminder

( p9.ggplot(gapminder)
    + p9.aes(x='year', y='lifeExp', group='country', color='continent')
    + p9.geom_line()
    + p9.labs(x='Year', y='Life expectancy', title='Trends in average life expectancy by country')
)
#> <ggplot: (161579175412)>

import matplotlib.pyplot as plt
from gapminder import gapminder

plt.figure(figsize=(10,5))

# One fixed color per continent
continent_colors = {'Africa':'red', 'Americas':'blue', 'Asia':'green', 'Europe':'black', 'Oceania':'purple'}

# Draw one line per country, colored by its continent
for continent, group in gapminder.groupby('continent'):
    color = continent_colors[continent]
    for country, sub_group in group.groupby('country'):
        plt.plot(sub_group['year'], sub_group['lifeExp'], color = color)

plt.xlabel("Year")
plt.ylabel("Life expectancy")
plt.title("Trends in average life expectancy by country")

# create a custom legend
for continent, color in continent_colors.items():
    plt.scatter([], [], c=color, label=continent)

plt.legend(scatterpoints=1, frameon=False, labelspacing=0.3, bbox_to_anchor=(1.1, 1), loc='upper right', bbox_transform=plt.gcf().transFigure)

plt.show()

import matplotlib.pyplot as plt
from gapminder import gapminder

import seaborn as sns

# One fixed color per continent
continent_colors = {'Africa':'red', 'Americas':'blue', 'Asia':'green', 'Europe':'black', 'Oceania':'purple'}

sns.set(rc={'figure.figsize':(10,5)})

# Draw one line per country, colored by its continent
for continent, group in gapminder.groupby('continent'):
    color = continent_colors[continent]
    for country, sub_group in group.groupby('country'):
        sns.lineplot(x="year", y="lifeExp", data=sub_group, color=color)

plt.xlabel("Year")
plt.ylabel("Life expectancy")
plt.title("Trends in average life expectancy by country")

# create a custom legend
for continent, color in continent_colors.items():
    plt.scatter([], [], c=color, label=continent)

plt.legend(scatterpoints=1, frameon=False, labelspacing=0.3, bbox_to_anchor=(1.1, 1), loc='upper right', bbox_transform=plt.gcf().transFigure)

plt.show()

4.2 Visualization Exercise 2

Visualize per-capita GDP and life expectancy for each continent as of 2002.

gapminder::gapminder %>% 
  filter(year == 2002) %>% 
  ggplot() + 
    aes(x = gdpPercap) + 
    aes(y = lifeExp) + 
    geom_point() + 
    aes(color = continent) + 
    aes(size = pop/1000000) + 
    labs(size = "Population\n(millions)") + 
    labs(color = NULL) + 
    labs(x = "Per Capita GDP ($US)") + 
    labs(y = "Life expectancy (years)") + 
    labs(title = "Life expectancy vs Per Capita GDP, 2002") + 
    labs(subtitle = "Data Source: gapminder package") + 
    labs(caption = "Produced for MA206 in Fall AY2023") + 
    facet_wrap(~ continent)
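
The exercise above is given only in R. A rough matplotlib counterpart, in the spirit of the side-by-side comparisons earlier on this page, could look like the sketch below. The function name `plot_gdp_vs_life_exp` and the layout choices are illustrative assumptions, not the author's solution. Population in millions is used directly as the marker area, and the per-continent colors, size legend, subtitle, and caption of the R version are left out to keep the sketch short.

```python
import matplotlib.pyplot as plt


def plot_gdp_vs_life_exp(df, year=2002):
    """One panel per continent: per-capita GDP vs life expectancy, sized by population."""
    subset = df[df["year"] == year]
    continents = sorted(subset["continent"].unique())

    # One column of the grid per continent, roughly like facet_wrap(~ continent)
    fig, axes = plt.subplots(
        1, len(continents),
        figsize=(3 * len(continents), 3),
        sharex=True, sharey=True,
        squeeze=False,  # always return a 2-D array of Axes
    )

    for ax, continent in zip(axes[0], continents):
        panel = subset[subset["continent"] == continent]
        ax.scatter(
            panel["gdpPercap"],
            panel["lifeExp"],
            s=panel["pop"] / 1_000_000,  # marker area = population in millions
            alpha=0.6,
        )
        ax.set_title(continent)
        ax.set_xlabel("Per Capita GDP (USD)")

    axes[0, 0].set_ylabel("Life expectancy (years)")
    fig.suptitle(f"Life expectancy vs Per Capita GDP, {year}")
    return fig, axes
```

With the gapminder data frame loaded as in the earlier Python blocks, `plot_gdp_vs_life_exp(gapminder)` followed by `plt.show()` should draw five panels, one per continent. Note the bracket access `df["pop"]`: attribute access would pick up the `DataFrame.pop` method instead of the column.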
