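12. Boolean selection on a Pandas DataFrame
Besides loc, iloc and the other accessors covered earlier, which address rows by their index labels or positions, the rows of a DataFrame can also be selected with a boolean sequence. Because the condition is evaluated on the values of ordinary (non-index) columns, this approach covers more cases than index-based access and is widely used: you supply a sequence of booleans, one per row, and the rows whose entry is True are returned.
- Boolean selection on a single column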
import pandas as pd
import numpy as np
# Build a 10x5 frame holding the values 10..59, with rows labelled a..j
val = np.arange(10, 60).reshape(10, 5)
col = ["ax", "bx", "cx", "dx", "ex"]
idx = list("abcdefghij")
df1 = pd.DataFrame(val, columns=col, index=idx)
print("dataframe", "*" * 11)
print(df1)
print("*" * 21, "<- dataframe")
# Boolean Series: True for every row whose bx value is greater than 30
bs = df1["bx"] > 30
print(df1[bs])
Program output:
dataframe ***********
   ax  bx  cx  dx  ex
a  10  11  12  13  14
b  15  16  17  18  19
c  20  21  22  23  24
d  25  26  27  28  29
e  30  31  32  33  34
f  35  36  37  38  39
g  40  41  42  43  44
h  45  46  47  48  49
i  50  51  52  53  54
j  55  56  57  58  59
********************* <- dataframe
   ax  bx  cx  dx  ex
e  30  31  32  33  34
f  35  36  37  38  39
g  40  41  42  43  44
h  45  46  47  48  49
i  50  51  52  53  54
j  55  56  57  58  59
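To make the "sequence of booleans" concrete, the following minimal sketch rebuilds the same frame and inspects the mask itself. It is an illustration added alongside the program above, not part of it; the names mask, selected and flags are introduced here only for the example.

```python
import numpy as np
import pandas as pd

val = np.arange(10, 60).reshape(10, 5)
df1 = pd.DataFrame(val, columns=["ax", "bx", "cx", "dx", "ex"],
                   index=list("abcdefghij"))

# The comparison yields a boolean Series that shares df1's index.
mask = df1["bx"] > 30
print(mask.dtype)   # bool
print(mask.sum())   # 6 rows have bx > 30

# df1[mask] and df1.loc[mask] select the same rows.
selected = df1[mask]
print(selected.equals(df1.loc[mask]))

# A plain Python list of booleans, one per row, works as well.
flags = mask.tolist()
print(df1[flags].index.tolist())
```

The mask carries one True or False per row, so counting its True entries with sum() is a quick way to check how many rows a condition will keep before applying it.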
- Boolean selection across several columns; boolean conditions can also be combined logically
import pandas as pd
import numpy as np
# Same 10x5 frame as before: values 10..59, rows labelled a..j
val = np.arange(10, 60).reshape(10, 5)
col = ["ax", "bx", "cx", "dx", "ex"]
idx = list("abcdefghij")
df1 = pd.DataFrame(val, columns=col, index=idx)
print("dataframe", "*" * 11)
print(df1)
print("*" * 21, "<- dataframe")
# Element-wise AND of two conditions; each comparison needs its own parentheses
bs = (df1["bx"] > 30) & (df1["cx"] > 40)
print(df1[bs])
Program output:
dataframe ***********
   ax  bx  cx  dx  ex
a  10  11  12  13  14
b  15  16  17  18  19
c  20  21  22  23  24
d  25  26  27  28  29
e  30  31  32  33  34
f  35  36  37  38  39
g  40  41  42  43  44
h  45  46  47  48  49
i  50  51  52  53  54
j  55  56  57  58  59
********************* <- dataframe
   ax  bx  cx  dx  ex
g  40  41  42  43  44
h  45  46  47  48  49
i  50  51  52  53  54
j  55  56  57  58  59
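The program above combines conditions with &. As a hedged sketch of the other logical combinations, the example below uses | (OR) and ~ (NOT) on the same frame, and shows that Python's own and keyword does not work on a Series. The names either and not_large are made up for this illustration. The parentheses around each comparison matter because & and | bind more tightly than > and <.

```python
import numpy as np
import pandas as pd

val = np.arange(10, 60).reshape(10, 5)
df1 = pd.DataFrame(val, columns=["ax", "bx", "cx", "dx", "ex"],
                   index=list("abcdefghij"))

# OR: keep rows where bx is small or cx is large.
either = (df1["bx"] < 20) | (df1["cx"] > 50)
print(df1[either].index.tolist())     # ['a', 'b', 'i', 'j']

# NOT: invert a condition element by element.
not_large = ~(df1["bx"] > 30)
print(df1[not_large].index.tolist())  # ['a', 'b', 'c', 'd']

# The keyword `and` asks for a single truth value of a whole Series,
# which pandas refuses to give, so it raises ValueError.
try:
    (df1["bx"] > 30) and (df1["cx"] > 40)
except ValueError as exc:
    print("ValueError:", exc)
```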
The result of a boolean selection is itself a DataFrame, so it can be accessed further with slices, column labels, loc, and so on.
import pandas as pd
import numpy as np
# Same 10x5 frame as before: values 10..59, rows labelled a..j
val = np.arange(10, 60).reshape(10, 5)
col = ["ax", "bx", "cx", "dx", "ex"]
idx = list("abcdefghij")
df1 = pd.DataFrame(val, columns=col, index=idx)
print("dataframe", "*" * 11)
print(df1)
print("*" * 21, "<- dataframe")
bs = df1["bx"] > 30
print(df1[bs])
# Pick two columns from the selected rows
print(df1[bs][["ax", "ex"]])
# Slice the selected rows by label; both end labels are included
print(df1[bs]["e": "h"])
Program output:
dataframe ***********
   ax  bx  cx  dx  ex
a  10  11  12  13  14
b  15  16  17  18  19
c  20  21  22  23  24
d  25  26  27  28  29
e  30  31  32  33  34
f  35  36  37  38  39
g  40  41  42  43  44
h  45  46  47  48  49
i  50  51  52  53  54
j  55  56  57  58  59
********************* <- dataframe
   ax  bx  cx  dx  ex    # output of df1[bs]
e  30  31  32  33  34
f  35  36  37  38  39
g  40  41  42  43  44
h  45  46  47  48  49
i  50  51  52  53  54
j  55  56  57  58  59
   ax  ex    # output of df1[bs][["ax", "ex"]]
e  30  34
f  35  39
g  40  44
h  45  49
i  50  54
j  55  59
   ax  bx  cx  dx  ex    # output of df1[bs]["e": "h"]
e  30  31  32  33  34
f  35  36  37  38  39
g  40  41  42  43  44
h  45  46  47  48  49
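The chained form df1[bs][["ax", "ex"]] first builds the filtered frame and then selects columns from it. As an alternative sketch, loc accepts a boolean mask for the rows together with column labels in one call. The names chained, combined and subset are introduced here only for the illustration.

```python
import numpy as np
import pandas as pd

val = np.arange(10, 60).reshape(10, 5)
df1 = pd.DataFrame(val, columns=["ax", "bx", "cx", "dx", "ex"],
                   index=list("abcdefghij"))

bs = df1["bx"] > 30

# Two steps: filter the rows, then pick the columns.
chained = df1[bs][["ax", "ex"]]

# One step: loc takes the row mask and the column labels together.
combined = df1.loc[bs, ["ax", "ex"]]
print(combined.equals(chained))

# loc on the filtered frame: label slice for rows plus a column list.
subset = df1[bs].loc["e":"h", ["ax", "ex"]]
print(subset)
```

For reading data the two forms return the same rows and columns. When the goal is to assign new values to the selected cells, the single loc call is usually the safer choice, since assigning through a chained selection may end up modifying a temporary copy.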