Read the winemag-data_first150k.csv file into reviews.
In [1]:
import pandas as pd
In [ ]:
# How to hide an unneeded column when loading the file: getting rid of the Unnamed column.
In [4]:
pd.read_csv('../data/winemag-data_first150k.csv')  # ../ moves up to the parent folder; pressing Tab after it shows the folder list automatically
Out[4]:
| | Unnamed: 0 | country | description | designation | points | price | province | region_1 | region_2 | variety | winery |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0 | US | This tremendous 100% varietal wine hails from ... | Martha's Vineyard | 96 | 235.0 | California | Napa Valley | Napa | Cabernet Sauvignon | Heitz |
1 | 1 | Spain | Ripe aromas of fig, blackberry and cassis are ... | Carodorum Selección Especial Reserva | 96 | 110.0 | Northern Spain | Toro | NaN | Tinta de Toro | Bodega Carmen Rodríguez |
2 | 2 | US | Mac Watson honors the memory of a wine once ma... | Special Selected Late Harvest | 96 | 90.0 | California | Knights Valley | Sonoma | Sauvignon Blanc | Macauley |
3 | 3 | US | This spent 20 months in 30% new French oak, an... | Reserve | 96 | 65.0 | Oregon | Willamette Valley | Willamette Valley | Pinot Noir | Ponzi |
4 | 4 | France | This is the top wine from La Bégude, named aft... | La Brûlade | 95 | 66.0 | Provence | Bandol | NaN | Provence red blend | Domaine de la Bégude |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
150925 | 150925 | Italy | Many people feel Fiano represents southern Ita... | NaN | 91 | 20.0 | Southern Italy | Fiano di Avellino | NaN | White Blend | Feudi di San Gregorio |
150926 | 150926 | France | Offers an intriguing nose with ginger, lime an... | Cuvée Prestige | 91 | 27.0 | Champagne | Champagne | NaN | Champagne Blend | H.Germain |
150927 | 150927 | Italy | This classic example comes from a cru vineyard... | Terre di Dora | 91 | 20.0 | Southern Italy | Fiano di Avellino | NaN | White Blend | Terredora |
150928 | 150928 | France | A perfect salmon shade, with scents of peaches... | Grand Brut Rosé | 90 | 52.0 | Champagne | Champagne | NaN | Champagne Blend | Gosset |
150929 | 150929 | Italy | More Pinot Grigios should taste like this. A r... | NaN | 90 | 15.0 | Northeastern Italy | Alto Adige | NaN | Pinot Grigio | Alois Lageder |
150930 rows × 11 columns
In [3]:
pd.read_csv('../data/winemag-data_first150k.csv', index_col=0)  # 0 or "Unnamed: 0" (pass the column you want to use as the index)
Out[3]:
| | country | description | designation | points | price | province | region_1 | region_2 | variety | winery |
---|---|---|---|---|---|---|---|---|---|---|
0 | US | This tremendous 100% varietal wine hails from ... | Martha's Vineyard | 96 | 235.0 | California | Napa Valley | Napa | Cabernet Sauvignon | Heitz |
1 | Spain | Ripe aromas of fig, blackberry and cassis are ... | Carodorum Selección Especial Reserva | 96 | 110.0 | Northern Spain | Toro | NaN | Tinta de Toro | Bodega Carmen Rodríguez |
2 | US | Mac Watson honors the memory of a wine once ma... | Special Selected Late Harvest | 96 | 90.0 | California | Knights Valley | Sonoma | Sauvignon Blanc | Macauley |
3 | US | This spent 20 months in 30% new French oak, an... | Reserve | 96 | 65.0 | Oregon | Willamette Valley | Willamette Valley | Pinot Noir | Ponzi |
4 | France | This is the top wine from La Bégude, named aft... | La Brûlade | 95 | 66.0 | Provence | Bandol | NaN | Provence red blend | Domaine de la Bégude |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
150925 | Italy | Many people feel Fiano represents southern Ita... | NaN | 91 | 20.0 | Southern Italy | Fiano di Avellino | NaN | White Blend | Feudi di San Gregorio |
150926 | France | Offers an intriguing nose with ginger, lime an... | Cuvée Prestige | 91 | 27.0 | Champagne | Champagne | NaN | Champagne Blend | H.Germain |
150927 | Italy | This classic example comes from a cru vineyard... | Terre di Dora | 91 | 20.0 | Southern Italy | Fiano di Avellino | NaN | White Blend | Terredora |
150928 | France | A perfect salmon shade, with scents of peaches... | Grand Brut Rosé | 90 | 52.0 | Champagne | Champagne | NaN | Champagne Blend | Gosset |
150929 | Italy | More Pinot Grigios should taste like this. A r... | NaN | 90 | 15.0 | Northeastern Italy | Alto Adige | NaN | Pinot Grigio | Alois Lageder |
150930 rows × 10 columns
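The effect of index_col can be checked without the real file. The sketch below is only an illustration: csv_text is a hypothetical three-row sample (not the wine data) written the way a CSV looks when it was saved together with its index, so the first header cell is empty.

```python
import io
import pandas as pd

# Hypothetical sample: the first header cell is empty, as in a CSV saved with its index.
csv_text = """,country,points,price
0,US,96,235.0
1,Spain,96,110.0
2,US,96,90.0
"""

# Without index_col, pandas loads the unnamed first column as "Unnamed: 0".
raw = pd.read_csv(io.StringIO(csv_text))
print(raw.columns.tolist())

# With index_col=0, that column becomes the index and is no longer a data column.
indexed = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(indexed.columns.tolist())
```

Passing index_col="Unnamed: 0" should select the same column by name instead of by position.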
In [5]:
reviews = pd.read_csv('../data/winemag-data_first150k.csv', index_col=0)
In [7]:
# Show the first 5 rows of the DataFrame.
reviews.head()
Out[7]:
| | country | description | designation | points | price | province | region_1 | region_2 | variety | winery |
---|---|---|---|---|---|---|---|---|---|---|
0 | US | This tremendous 100% varietal wine hails from ... | Martha's Vineyard | 96 | 235.0 | California | Napa Valley | Napa | Cabernet Sauvignon | Heitz |
1 | Spain | Ripe aromas of fig, blackberry and cassis are ... | Carodorum Selección Especial Reserva | 96 | 110.0 | Northern Spain | Toro | NaN | Tinta de Toro | Bodega Carmen Rodríguez |
2 | US | Mac Watson honors the memory of a wine once ma... | Special Selected Late Harvest | 96 | 90.0 | California | Knights Valley | Sonoma | Sauvignon Blanc | Macauley |
3 | US | This spent 20 months in 30% new French oak, an... | Reserve | 96 | 65.0 | Oregon | Willamette Valley | Willamette Valley | Pinot Noir | Ponzi |
4 | France | This is the top wine from La Bégude, named aft... | La Brûlade | 95 | 66.0 | Provence | Bandol | NaN | Provence red blend | Domaine de la Bégude |
In [8]:
# Show the last 5 rows of the DataFrame.
reviews.tail()
Out[8]:
| | country | description | designation | points | price | province | region_1 | region_2 | variety | winery |
---|---|---|---|---|---|---|---|---|---|---|
150925 | Italy | Many people feel Fiano represents southern Ita... | NaN | 91 | 20.0 | Southern Italy | Fiano di Avellino | NaN | White Blend | Feudi di San Gregorio |
150926 | France | Offers an intriguing nose with ginger, lime an... | Cuvée Prestige | 91 | 27.0 | Champagne | Champagne | NaN | Champagne Blend | H.Germain |
150927 | Italy | This classic example comes from a cru vineyard... | Terre di Dora | 91 | 20.0 | Southern Italy | Fiano di Avellino | NaN | White Blend | Terredora |
150928 | France | A perfect salmon shade, with scents of peaches... | Grand Brut Rosé | 90 | 52.0 | Champagne | Champagne | NaN | Champagne Blend | Gosset |
150929 | Italy | More Pinot Grigios should taste like this. A r... | NaN | 90 | 15.0 | Northeastern Italy | Alto Adige | NaN | Pinot Grigio | Alois Lageder |
In [9]:
# When you only want to see the first 2 rows
reviews.head(2)
Out[9]:
| | country | description | designation | points | price | province | region_1 | region_2 | variety | winery |
---|---|---|---|---|---|---|---|---|---|---|
0 | US | This tremendous 100% varietal wine hails from ... | Martha's Vineyard | 96 | 235.0 | California | Napa Valley | Napa | Cabernet Sauvignon | Heitz |
1 | Spain | Ripe aromas of fig, blackberry and cassis are ... | Carodorum Selección Especial Reserva | 96 | 110.0 | Northern Spain | Toro | NaN | Tinta de Toro | Bodega Carmen Rodríguez |
In [10]:
# Look at only the last 2 rows
In [11]:
reviews.tail(2)
Out[11]:
| | country | description | designation | points | price | province | region_1 | region_2 | variety | winery |
---|---|---|---|---|---|---|---|---|---|---|
150928 | France | A perfect salmon shade, with scents of peaches... | Grand Brut Rosé | 90 | 52.0 | Champagne | Champagne | NaN | Champagne Blend | Gosset |
150929 | Italy | More Pinot Grigios should taste like this. A r... | NaN | 90 | 15.0 | Northeastern Italy | Alto Adige | NaN | Pinot Grigio | Alois Lageder |
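To see how head() and tail() pick rows without loading the full file, here is a minimal sketch on a hypothetical seven-row frame named toy (the name and values are made up for illustration).

```python
import pandas as pd

# Hypothetical toy frame; only the row positions matter here.
toy = pd.DataFrame({"points": [96, 96, 96, 96, 95, 91, 90]})

first_two = toy.head(2)     # rows at positions 0 and 1
last_two = toy.tail(2)      # rows at positions 5 and 6
default_head = toy.head()   # n defaults to 5 when omitted

print(first_two)
print(last_two)
print(len(default_head))
```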
In [14]:
# Number of rows and number of columns
reviews.shape
Out[14]:
(150930, 10)
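Since shape is a plain tuple of (rows, columns), it can be unpacked directly. A small sketch, assuming a hypothetical frame named toy:

```python
import pandas as pd

# Hypothetical toy frame with 3 rows and 2 columns.
toy = pd.DataFrame({"country": ["US", "Spain", "France"], "points": [96, 96, 95]})

# shape is a tuple, so it unpacks into the row count and the column count.
n_rows, n_cols = toy.shape
print(n_rows, n_cols)
```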
In [13]:
# Summary statistics for the numeric columns only
# count : number of values, excluding missing ones
# 50% : the median
reviews.describe()
Out[13]:
| | points | price |
---|---|---|
count | 150930.000000 | 137235.000000 |
mean | 87.888418 | 33.131482 |
std | 3.222392 | 36.322536 |
min | 80.000000 | 4.000000 |
25% | 86.000000 | 16.000000 |
50% | 88.000000 | 24.000000 |
75% | 90.000000 | 40.000000 |
max | 100.000000 | 2300.000000 |
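The points about describe() (numeric columns only, count skipping missing values, 50% being the median) can be verified on a small example. The frame toy below is hypothetical, with one missing price on purpose.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame: one text column, two numeric columns, one missing price.
toy = pd.DataFrame({
    "country": ["US", "Spain", "US", "France"],
    "points": [96, 96, 95, 90],
    "price": [235.0, 110.0, np.nan, 66.0],
})

stats = toy.describe()

# Only the numeric columns appear in the result.
print(stats.columns.tolist())

# count ignores the NaN in price, so it is 3 rather than 4.
print(stats.loc["count", "price"])

# The 50% row matches the median of the column.
print(stats.loc["50%", "price"], toy["price"].median())
```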
In [15]:
reviews.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 150930 entries, 0 to 150929
Data columns (total 10 columns):
 #   Column       Non-Null Count   Dtype
---  ------       --------------   -----
 0   country      150925 non-null  object
 1   description  150930 non-null  object
 2   designation  105195 non-null  object
 3   points       150930 non-null  int64
 4   price        137235 non-null  float64
 5   province     150925 non-null  object
 6   region_1     125870 non-null  object
 7   region_2     60953 non-null   object
 8   variety      150930 non-null  object
 9   winery       150930 non-null  object
dtypes: float64(1), int64(1), object(8)
memory usage: 12.7+ MB
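The Non-Null Count column of info() tells you how many values are missing per column: in the output above, price has 150930 - 137235 = 13695 missing values. The same numbers can be obtained as data with count() and isna().sum(). A minimal sketch on a hypothetical frame named toy:

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with one missing country and one missing price.
toy = pd.DataFrame({
    "country": ["US", None, "France"],
    "price": [235.0, np.nan, 66.0],
    "points": [96, 96, 95],
})

non_null = toy.count()       # non-null values per column, like info()'s Non-Null Count
missing = toy.isna().sum()   # missing values per column

print(non_null)
print(missing)
```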