11. Slicing a pandas DataFrame
In pandas, DataFrame[label] or DataFrame[index], that is, a single key inside [], selects a column. DataFrame[start:end], by contrast, is a slice and selects rows.
import pandas as pd
import numpy as np
val = np.arange(10, 60).reshape(10, 5)
col = ["ax", "bx", "cx", "dx", "ex"]
idx = list("abcdefghij")
df1 = pd.DataFrame(val, columns = col, index = idx)
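print("dataframe", "*" * 11)
print(df1)
print("*" * 21, "<- dataframe")
# A single label selects one column (returned as a Series).
print(df1["bx"])
# A label slice selects rows; the end label "e" is included.
print(df1["a":"e"])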
Program output:
dataframe ***********
   ax  bx  cx  dx  ex
a  10  11  12  13  14
b  15  16  17  18  19
c  20  21  22  23  24
d  25  26  27  28  29
e  30  31  32  33  34
f  35  36  37  38  39
g  40  41  42  43  44
h  45  46  47  48  49
i  50  51  52  53  54
j  55  56  57  58  59
********************* <- dataframe
a    11
b    16
c    21
d    26
e    31
f    36
g    41
h    46
i    51
j    56
Name: bx, dtype: int64
   ax  bx  cx  dx  ex
a  10  11  12  13  14
b  15  16  17  18  19
c  20  21  22  23  24
d  25  26  27  28  29
e  30  31  32  33  34
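The column-versus-row behaviour of [] can be checked directly. The following is a minimal sketch that rebuilds the same df1; the variable names col_bx, rows_by_label and rows_by_pos are illustrative only and do not appear in the original program. It also assumes that an integer slice inside [] is interpreted by position when the index holds strings, as it does here.

```python
import numpy as np
import pandas as pd

val = np.arange(10, 60).reshape(10, 5)
col = ["ax", "bx", "cx", "dx", "ex"]
idx = list("abcdefghij")
df1 = pd.DataFrame(val, columns=col, index=idx)

# A single key is looked up among the column labels and returns a Series.
col_bx = df1["bx"]

# A label slice selects rows and includes the end label "e".
rows_by_label = df1["a":"e"]

# An integer slice also selects rows, by position, and excludes the end.
rows_by_pos = df1[0:5]

print(type(col_bx).__name__)
print(list(rows_by_label.index))
print(rows_by_label.equals(rows_by_pos))
```

Both slices cover rows a through e, even though one is written with labels and the other with positions. The only difference is whether the end point is included.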
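If a list is given inside [], several columns can be selected, although strictly speaking this is not slicing. Giving two lists, however, does not select multiple rows and multiple columns.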
import pandas as pd
import numpy as np
val = np.arange(10, 60).reshape(10, 5)
col = ["ax", "bx", "cx", "dx", "ex"]
idx = list("abcdefghij")
df1 = pd.DataFrame(val, columns = col, index = idx)
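print("dataframe", "*" * 11)
print(df1)
print("*" * 21, "<- dataframe")
# A list of labels selects several columns at once.
print(df1[["bx", "cx", "ex"]])
# Two lists (rows, columns) are not accepted by []; this line raises an error.
# print(df1[["a", "e"], ["bx", "cx", "ex"]])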
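It is hard to select a block of several rows and several columns with slices inside the DataFrame's own []. The loc and iloc indexers (among others) accept slices for both rows and columns and make this straightforward.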
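As a rough illustration of the two-list case, the sketch below first shows that df1[rows, cols] fails and then performs the same selection through loc. The names failed and block are illustrative. The exact exception type is not pinned down here because it differs between pandas versions, so the code catches a generic Exception.

```python
import numpy as np
import pandas as pd

val = np.arange(10, 60).reshape(10, 5)
col = ["ax", "bx", "cx", "dx", "ex"]
idx = list("abcdefghij")
df1 = pd.DataFrame(val, columns=col, index=idx)

# Two lists inside plain [] are treated as one (invalid) key.
try:
    df1[["a", "e"], ["bx", "cx", "ex"]]
    failed = False
except Exception as exc:  # exception type varies across pandas versions
    failed = True
    print("df1[rows, cols] failed with", type(exc).__name__)

# loc accepts a list of row labels and a list of column labels.
block = df1.loc[["a", "e"], ["bx", "cx", "ex"]]
print(block)
```

With loc, the first list is matched against the index and the second against the columns, which gives the row and column selection that plain [] cannot express.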
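11.1 Row and column slicing with loc[]
Giving loc[] a label slice for the rows and a label slice for the columns selects a block. Label slices include both end points.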
import pandas as pd
import numpy as np
val = np.arange(10, 60).reshape(10, 5)
col = ["ax", "bx", "cx", "dx", "ex"]
idx = list("abcdefghij")
df1 = pd.DataFrame(val, columns = col, index = idx)
print("dataframe", "*" * 11)
print(df1)
print("*" * 21, "<- dataframe")
# Rows "b" through "e" and columns "bx" through "ex", end labels included.
print(df1.loc["b":"e", "bx":"ex"])
Program output:
dataframe ***********
   ax  bx  cx  dx  ex
a  10  11  12  13  14
b  15  16  17  18  19
c  20  21  22  23  24
d  25  26  27  28  29
e  30  31  32  33  34
f  35  36  37  38  39
g  40  41  42  43  44
h  45  46  47  48  49
i  50  51  52  53  54
j  55  56  57  58  59
********************* <- dataframe
   bx  cx  dx  ex
b  16  17  18  19
c  21  22  23  24
d  26  27  28  29
e  31  32  33  34
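The loc[] block selection can be varied with open-ended slices. The sketch below is only an illustration on the same df1; the names block and tail are not from the original text.

```python
import numpy as np
import pandas as pd

val = np.arange(10, 60).reshape(10, 5)
col = ["ax", "bx", "cx", "dx", "ex"]
idx = list("abcdefghij")
df1 = pd.DataFrame(val, columns=col, index=idx)

# Both end labels are part of the result: 4 rows (b..e) and 4 columns (bx..ex).
block = df1.loc["b":"e", "bx":"ex"]
print(block.shape)

# Leaving out one side of a slice runs to the start or end of that axis.
tail = df1.loc["h":, :"bx"]
print(tail)
```

Here tail holds rows h, i and j with columns ax and bx, because the row slice has no end label and the column slice has no start label.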
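11.2 Row and column slicing with iloc[]
Giving iloc[] positional slices for the rows and the columns also selects a block. As with ordinary Python slices, the end position is excluded.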
import pandas as pd
import numpy as np
val = np.arange(10, 60).reshape(10, 5)
col = ["ax", "bx", "cx", "dx", "ex"]
idx = list("abcdefghij")
df1 = pd.DataFrame(val, columns = col, index = idx)
print("dataframe", "*" * 11)
print(df1)
print("*" * 21, "<- dataframe")
# Row positions 2..5 and column positions 2..3; the end positions are excluded.
print(df1.iloc[2:6, 2:4])
Program output:
dataframe ***********
   ax  bx  cx  dx  ex
a  10  11  12  13  14
b  15  16  17  18  19
c  20  21  22  23  24
d  25  26  27  28  29
e  30  31  32  33  34
f  35  36  37  38  39
g  40  41  42  43  44
h  45  46  47  48  49
i  50  51  52  53  54
j  55  56  57  58  59
********************* <- dataframe
   cx  dx
c  22  23
d  27  28
e  32  33
f  37  38
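To make the difference between positional and label slices concrete, the following sketch compares the iloc[] selection with the loc[] selection that yields the same block, and adds a negative-position slice. The names by_pos, by_label and corner are illustrative.

```python
import numpy as np
import pandas as pd

val = np.arange(10, 60).reshape(10, 5)
col = ["ax", "bx", "cx", "dx", "ex"]
idx = list("abcdefghij")
df1 = pd.DataFrame(val, columns=col, index=idx)

# Positions 2..5 are rows c..f; positions 2..3 are columns cx..dx.
by_pos = df1.iloc[2:6, 2:4]

# The same block by label needs the last label itself as the end point.
by_label = df1.loc["c":"f", "cx":"dx"]
print(by_pos.equals(by_label))

# Negative positions count from the end, as in Python lists.
corner = df1.iloc[-2:, -2:]
print(corner)
```

The positional end 6 corresponds to the label "f" at position 5, which is the off-by-one to watch for when converting between the two forms.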
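11.3 Row and column slicing with ix[]
ix[] accepts slices by position or by label, that is, mixed slices, with rows first and columns second, and can also select a block. Note that .ix was deprecated in pandas 0.20.0 and removed in pandas 1.0, so the example below only runs on older versions.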
import pandas as pd
import numpy as np
val = np.arange(10, 60).reshape(10, 5)
col = ["ax", "bx", "cx", "dx", "ex"]
idx = list("abcdefghij")
df1 = pd.DataFrame(val, columns = col, index = idx)
print("dataframe", "*" * 11)
print(df1)
print("*" * 21, "<- dataframe")
# .ix only exists in pandas < 1.0: rows by position, columns by label.
print(df1.ix[2:6, "bx":"ex"])
Program output:
dataframe ***********
   ax  bx  cx  dx  ex
a  10  11  12  13  14
b  15  16  17  18  19
c  20  21  22  23  24
d  25  26  27  28  29
e  30  31  32  33  34
f  35  36  37  38  39
g  40  41  42  43  44
h  45  46  47  48  49
i  50  51  52  53  54
j  55  56  57  58  59
********************* <- dataframe
   bx  cx  dx  ex
c  21  22  23  24
d  26  27  28  29
e  31  32  33  34
f  36  37  38  39
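On pandas versions where .ix is no longer available, the same mixed selection (rows by position, columns by label) can be written with iloc and loc. The sketch below shows two possible spellings; it is one way to do it rather than an official replacement recipe, and the names chained and via_index are illustrative.

```python
import numpy as np
import pandas as pd

val = np.arange(10, 60).reshape(10, 5)
col = ["ax", "bx", "cx", "dx", "ex"]
idx = list("abcdefghij")
df1 = pd.DataFrame(val, columns=col, index=idx)

# Option 1: pick the rows by position first, then the columns by label.
chained = df1.iloc[2:6].loc[:, "bx":"ex"]

# Option 2: turn the row positions into labels and use a single loc call.
via_index = df1.loc[df1.index[2:6], "bx":"ex"]

print(chained)
print(chained.equals(via_index))
```

Both forms return rows c through f and columns bx through ex, matching the ix[] output above. The single loc call is usually preferable when the result will be assigned to, since chained indexing may operate on a copy.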