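Preface:
Please read first: Why should we write Pythonic code? - Zhihu column
Main text:
This document is based on Raymond Hettinger's talk at PyCon US 2013. The code is written for Python 2; places where Python 3 differs are noted.
YouTube link: video
Slides link: slides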
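Use xrange (Python 2) / range (Python 3)
for i in [0, 1, 2, 3, 4, 5]:
    print i**2
for i in range(6):
    print i**2
Better
for i in xrange(6):
    print i**2
xrange produces its values lazily, one at a time, so it uses far less memory than range, which builds the whole list in Python 2.
In Python 3, xrange was renamed to range.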
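The memory claim can be checked directly. The following is a small sketch written for Python 3 (where range is already lazy); the size n is an arbitrary value chosen for illustration.

```python
import sys

n = 10 ** 6                  # arbitrary size, chosen only for illustration
lazy = range(n)              # Python 3 range: a lazy sequence, no list is built
eager = list(range(n))       # materialises every integer up front

lazy_size = sys.getsizeof(lazy)     # small and independent of n
eager_size = sys.getsizeof(eager)   # grows with n

# Looping works the same way with the lazy object
squares_total = sum(i ** 2 for i in range(6))
```

sys.getsizeof reports only the container itself, but that is enough to see that the lazy object does not grow with n.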
Looping over a collection
colors = ['red', 'green', 'blue', 'yellow']
for i in range(len(colors)):
    print colors[i]
Better
for color in colors:
    print color
Looping backwards over a collection
colors = ['red', 'green', 'blue', 'yellow']
for i in range(len(colors)-1, -1, -1):
    print colors[i]
Better
for color in reversed(colors):
    print color
Looping over a collection and its indices
colors = ['red', 'green', 'blue', 'yellow']
for i in range(len(colors)):
    print i, '-->', colors[i]
Better
for i, color in enumerate(colors):
    print i, '-->', color
Looping over several collections at once
names = ['raymond', 'rachel', 'matthew']
colors = ['red', 'green', 'blue', 'yellow']
n = min(len(names), len(colors))
for i in range(n):
    print names[i], '-->', colors[i]
for name, color in zip(names, colors):
    print name, '-->', color
Better
from itertools import izip
for name, color in izip(names, colors):
    print name, '-->', color
In Python 2, zip builds a new list of tuples, which uses more memory.
izip returns an iterator that produces one pair at a time, which saves memory.
Note: In Python 3, izip was renamed to zip.
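To illustrate the Python 3 behaviour mentioned in the note, here is a minimal sketch (Python 3 syntax) showing that zip hands out pairs on demand and stops at the shorter input.

```python
names = ['raymond', 'rachel', 'matthew']
colors = ['red', 'green', 'blue', 'yellow']

pairs = zip(names, colors)   # Python 3: a lazy iterator, not a list
first = next(pairs)          # a pair is produced only when asked for
rest = list(pairs)           # stops at the shorter input, so 'yellow' is dropped
```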
Looping in sorted order
colors = ['red', 'green', 'blue', 'yellow']
# Forward sorted order
for color in sorted(colors):
    print color
# Backwards sorted order
for color in sorted(colors, reverse=True):
    print color
Custom sort keys
colors = ['red', 'green', 'blue', 'yellow']
def compare_length(c1, c2):
    if len(c1) < len(c2): return -1
    if len(c1) > len(c2): return 1
    return 0
print sorted(colors, cmp=compare_length)
Better
print sorted(colors, key=len)
Comparison functions (the cmp argument) have been removed in Python 3.
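If an old comparison function has to be kept when moving to Python 3, the standard library's functools.cmp_to_key can wrap it as a key function. This is a sketch in Python 3 syntax, reusing the compare_length idea from above.

```python
from functools import cmp_to_key

colors = ['red', 'green', 'blue', 'yellow']

def compare_length(c1, c2):
    # Old-style comparison: negative, zero, or positive
    return (len(c1) > len(c2)) - (len(c1) < len(c2))

by_cmp = sorted(colors, key=cmp_to_key(compare_length))  # adapter for legacy code
by_key = sorted(colors, key=len)                         # the preferred form
```

Both calls give the same ordering; the key form is shorter and calls len only once per element.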
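Call a function repeatedly with iter()
blocks = []
while True:
    block = f.read(32)
    if block == '':
        break
    blocks.append(block)
Better
from functools import partial
blocks = []
for block in iter(partial(f.read, 32), ''):
    blocks.append(block)
When iter is given two arguments, the first is a callable (a function taking no arguments) and the second is a sentinel value. iter keeps calling the callable and ends the iteration (by raising StopIteration) as soon as the callable returns the sentinel.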
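The two-argument form can be tried without a real file. The sketch below is Python 3 and uses io.BytesIO as a stand-in for f (an assumption made for illustration); since the stream is binary, the sentinel is b'' rather than ''.

```python
import io
from functools import partial

f = io.BytesIO(b'x' * 70)   # stand-in for a real file opened in binary mode

blocks = []
# read() returns b'' at end of file, so b'' is the sentinel
for block in iter(partial(f.read, 32), b''):
    blocks.append(block)

sizes = [len(block) for block in blocks]
```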
Use for/else
def find(seq, target):
    found = False
    for i, value in enumerate(seq):
        if value == target:
            found = True
            break
    if not found:
        return -1
    return i
Better
def find(seq, target):
    for i, value in enumerate(seq):
        if value == target:
            break
    else:
        return -1
    return i
Note: Raymond suggests reading the else here as "no break": the else block runs only when the loop finishes without hitting break.
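Looping over dictionary keys
d = {'matthew': 'blue', 'rachel': 'green', 'raymond': 'red'}
for k in d:
    print k
for k in d.keys():
    if k.startswith('r'):
        del d[k]
Note: You cannot modify a dictionary while iterating over it. When you need to change the dictionary, use the second form: in Python 2, d.keys() returns a separate list of the keys, so the loop is not iterating over the dictionary itself.
Note: In Python 3, d.keys() is a view of the dictionary, so the code above raises:
RuntimeError: dictionary changed size during iteration
You need to make a copy of the keys first.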
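One way to copy the keys in Python 3 is sketched below. list(d) takes a snapshot of the keys before the loop starts; the dict comprehension is an alternative that avoids mutating in place.

```python
d = {'matthew': 'blue', 'rachel': 'green', 'raymond': 'red'}

# list(d) snapshots the keys, so deleting from d inside the loop is safe
for k in list(d):
    if k.startswith('r'):
        del d[k]

# Alternative: build a new dict instead of deleting from the old one
original = {'matthew': 'blue', 'rachel': 'green', 'raymond': 'red'}
filtered = {k: v for k, v in original.items() if not k.startswith('r')}
```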
Looping over dictionary keys and values
# Not very fast, has to re-hash every key and do a lookup
for k in d:
    print k, '-->', d[k]
# Makes a big huge list
for k, v in d.items():
    print k, '-->', v
Better
for k, v in d.iteritems():
    print k, '-->', v
iteritems() returns an iterator instead of building a list.
Note: In Python 3, items() gives the same effect.
Building a dictionary from pairs
names = ['raymond', 'rachel', 'matthew']
colors = ['red', 'green', 'blue']
d = dict(izip(names, colors))
# {'matthew': 'blue', 'rachel': 'green', 'raymond': 'red'}
Note: In Python 3, write d = dict(zip(names, colors))
Counting with a dictionary
colors = ['red', 'green', 'red', 'blue', 'green', 'red']
# A simple way to count, suitable for beginners
d = {}
for color in colors:
    if color not in d:
        d[color] = 0
    d[color] += 1
# {'blue': 1, 'green': 2, 'red': 3}
Better
d = {}
for color in colors:
    d[color] = d.get(color, 0) + 1
# Alternative using collections.defaultdict
from collections import defaultdict
d = defaultdict(int)
for color in colors:
    d[color] += 1
Grouping with a dictionary
names = ['raymond', 'rachel', 'matthew', 'roger',
         'betty', 'melissa', 'judith', 'charlie']
# In this example, we're grouping by name length
d = {}
for name in names:
    key = len(name)
    if key not in d:
        d[key] = []
    d[key].append(name)
# {5: ['roger', 'betty'], 6: ['rachel', 'judith'], 7: ['raymond', 'matthew', 'melissa', 'charlie']}
d = {}
for name in names:
    key = len(name)
    d.setdefault(key, []).append(name)
Better
from collections import defaultdict
d = defaultdict(list)
for name in names:
    key = len(name)
    d[key].append(name)
popitem() is atomic
d = {'matthew': 'blue', 'rachel': 'green', 'raymond': 'red'}
while d:
    key, value = d.popitem()
    print key, '-->', value
popitem is atomic, so no lock is needed around it when it is called from multiple threads.
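A rough sketch of what this allows, in Python 3: several threads drain one shared dictionary without a lock. It relies on CPython's behaviour, and the work items and thread count are made up for illustration. The separate "while d" check is replaced by catching KeyError, because another thread may empty the dictionary between the check and the call.

```python
import threading

d = {i: i * i for i in range(1000)}   # hypothetical shared work items
expected = dict(d)
results = []

def worker():
    while True:
        try:
            key, value = d.popitem()   # removes and returns one pair in a single step
        except KeyError:               # raised once the dict is empty
            return
        results.append((key, value))   # list.append is likewise thread-safe in CPython

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Every item ends up in results exactly once, whichever thread happened to pop it.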
Linking dictionaries
import argparse
import os
defaults = {'color': 'red', 'user': 'guest'}
parser = argparse.ArgumentParser()
parser.add_argument('-u', '--user')
parser.add_argument('-c', '--color')
namespace = parser.parse_args([])
command_line_args = {k: v for k, v in vars(namespace).items() if v}
# The common approach below allows you to use defaults at first, then override them
# with environment variables and then finally override them with command line arguments.
# It copies data like crazy, unfortunately.
d = defaults.copy()
d.update(os.environ)
d.update(command_line_args)
Better
from collections import ChainMap
d = ChainMap(command_line_args, os.environ, defaults)
ChainMap was introduced in Python 3 (version 3.3).
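The lookup order can be seen in a small self-contained sketch (Python 3). The environ and command_line_args dictionaries here are hypothetical stand-ins for os.environ and the parsed arguments.

```python
from collections import ChainMap

defaults = {'color': 'red', 'user': 'guest'}
environ = {'user': 'alice'}             # hypothetical stand-in for os.environ
command_line_args = {'color': 'blue'}   # hypothetical parsed arguments

d = ChainMap(command_line_args, environ, defaults)
color = d['color']   # found in command_line_args, the first mapping
user = d['user']     # absent from the command line, so environ wins

d['color'] = 'green'   # writes go to the first mapping only
```

ChainMap keeps references to the underlying dictionaries and searches them left to right, so nothing is copied.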
Improving Clarity
Positional arguments and indices are nice
Keywords and names are better
The first way is convenient for the computer
The second corresponds to how humans think
Use keyword arguments to make calls more readable
twitter_search('@obama', False, 20, True)
Better
twitter_search('@obama', retweets=False, numtweets=20, popular=True)
Use named tuples to return more readable results
# Old testmod return value
doctest.testmod()
# (0, 4)
# Is this good or bad? You don't know because it's not clear.
Better
# New testmod return value, a namedtuple
doctest.testmod()
# TestResults(failed=0, attempted=4)
To make a namedtuple:
from collections import namedtuple
TestResults = namedtuple('TestResults', ['failed', 'attempted'])
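A short sketch of how such a named tuple behaves (Python 3). The values 0 and 4 are taken from the example above.

```python
from collections import namedtuple

TestResults = namedtuple('TestResults', ['failed', 'attempted'])

result = TestResults(failed=0, attempted=4)
failed, attempted = result   # still unpacks like a plain tuple
summary = repr(result)       # the self-describing form shown above
```

Because a named tuple is still a tuple, code that indexes or unpacks the old return value keeps working.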
Unpacking sequences
p = 'Raymond', 'Hettinger', 0x30, 'python@example.com'
# A common approach / habit from other languages
fname = p[0]
lname = p[1]
age = p[2]
email = p[3]
Better
fname, lname, age, email = p
The second approach uses tuple unpacking and is faster and more readable.
Unpacking multiple variables
def fibonacci(n):
    x = 0
    y = 1
    for i in range(n):
        print x
        t = y
        y = x + y
        x = t
Better
def fibonacci(n):
    x, y = 0, 1
    for i in range(n):
        print x
        x, y = y, x + y
Update state variables simultaneously
tmp_x = x + dx * t
tmp_y = y + dy * t
tmp_dx = influence(m, x, y, dx, dy, partial='x')
tmp_dy = influence(m, x, y, dx, dy, partial='y')
x = tmp_x
y = tmp_y
dx = tmp_dx
dy = tmp_dy
Better
x, y, dx, dy = (x + dx * t,
                y + dy * t,
                influence(m, x, y, dx, dy, partial='x'),
                influence(m, x, y, dx, dy, partial='y'))
Efficiency
In general, do not create data you do not need.
Concatenating strings
names = ['raymond', 'rachel', 'matthew', 'roger',
         'betty', 'melissa', 'judith', 'charlie']
s = names[0]
for name in names[1:]:
    s += ', ' + name
print s
Better
print ', '.join(names)
Use the right data structure
names = ['raymond', 'rachel', 'matthew', 'roger',
         'betty', 'melissa', 'judith', 'charlie']
del names[0]
# The below are signs you're using the wrong data structure
names.pop(0)
names.insert(0, 'mark')
Better
from collections import deque
names = deque(['raymond', 'rachel', 'matthew', 'roger',
               'betty', 'melissa', 'judith', 'charlie'])
# More efficient with deque
del names[0]
names.popleft()
names.appendleft('mark')
Decorators and context managers
Use decorators to factor out administrative logic
# Mixes business / administrative logic and is not reusable
def web_lookup(url, saved={}):
    if url in saved:
        return saved[url]
    page = urllib.urlopen(url).read()
    saved[url] = page
    return page
Better
@cache
def web_lookup(url):
    return urllib.urlopen(url).read()
Note: From Python 3.2 onwards, use functools.lru_cache.
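The @cache decorator is not defined in the snippet above. As a rough Python 3 illustration of the note, the sketch below uses functools.lru_cache instead; the function body is a hypothetical stand-in for the network call so that it runs offline, and the calls list exists only to show when the body actually executes.

```python
from functools import lru_cache

calls = []   # records which URLs actually triggered a "fetch"

@lru_cache(maxsize=None)
def web_lookup(url):
    # Hypothetical stand-in for urllib.request.urlopen(url).read()
    calls.append(url)
    return 'page for ' + url

first = web_lookup('http://example.com')
second = web_lookup('http://example.com')   # served from the cache
info = web_lookup.cache_info()
```

The caching logic now lives in the decorator, so web_lookup itself contains only the business logic.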
Factor out temporary contexts
from decimal import Decimal, Context, getcontext, setcontext, localcontext
# Saving the old, restoring the new
old_context = getcontext().copy()
getcontext().prec = 50
print Decimal(355) / Decimal(113)
setcontext(old_context)
Better
with localcontext(Context(prec=50)):
    print Decimal(355) / Decimal(113)
Opening and closing files
f = open('data.txt')
try:
    data = f.read()
finally:
    f.close()
Better
with open('data.txt') as f:
    data = f.read()
How to use locks
# Make a lock
lock = threading.Lock()
# Old-way to use a lock
lock.acquire()
try:
    print 'Critical section 1'
    print 'Critical section 2'
finally:
    lock.release()
Better
# New-way to use a lock
with lock:
    print 'Critical section 1'
    print 'Critical section 2'
Use ignored() instead of try/except/pass
try:
    os.remove('somefile.tmp')
except OSError:
    pass
Better
with ignored(OSError):
    os.remove('somefile.tmp')
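ignored is not a built-in, so it has to come from somewhere. One possible way to write it is sketched below in Python 3, as an assumption rather than the talk's exact code. From Python 3.4 the standard library offers contextlib.suppress, which plays the same role. The file name is hypothetical and is expected not to exist.

```python
import os
from contextlib import contextmanager, suppress

@contextmanager
def ignored(*exceptions):
    # Swallow only the listed exception types; anything else propagates
    try:
        yield
    except exceptions:
        pass

missing = 'somefile-that-does-not-exist.tmp'   # hypothetical file name

with ignored(OSError):
    os.remove(missing)     # the OSError raised here is silently dropped

with suppress(OSError):    # standard-library equivalent since Python 3.4
    os.remove(missing)

reached_end = True         # execution continues normally after both blocks
```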
Use context managers to reduce temporary variables
# Temporarily redirect standard out to a file and then return it to normal
with open('help.txt', 'w') as f:
    oldstdout = sys.stdout
    sys.stdout = f
    try:
        help(pow)
    finally:
        sys.stdout = oldstdout
Better
with open('help.txt', 'w') as f:
    with redirect_stdout(f):
        help(pow)
Writing your own redirect_stdout context manager
import sys
from contextlib import contextmanager
@contextmanager
def redirect_stdout(fileobj):
    oldstdout = sys.stdout
    sys.stdout = fileobj
    try:
        yield fileobj
    finally:
        sys.stdout = oldstdout
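From Python 3.4 a ready-made redirect_stdout ships in contextlib, so a hand-written version is mainly needed on older interpreters. The sketch below (Python 3) captures output in an in-memory io.StringIO buffer instead of a file, purely to keep the example self-contained.

```python
import io
from contextlib import redirect_stdout   # in the standard library since Python 3.4

buffer = io.StringIO()
with redirect_stdout(buffer):
    print('captured')        # goes into the buffer, not the terminal
print('back to normal')      # stdout is restored after the with block

captured = buffer.getvalue()
```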
List comprehensions and generator expressions
result = []
for i in range(10):
    s = i ** 2
    result.append(s)
print sum(result)
Better
print sum([i**2 for i in xrange(10)])
print sum(i**2 for i in xrange(10))
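The difference between the last two lines is that the list comprehension builds the whole list before summing, while the generator expression feeds sum one value at a time. A small Python 3 sketch of that difference follows, including the caveat that a generator can be consumed only once.

```python
as_list = [i ** 2 for i in range(10)]   # all ten squares exist in memory
as_gen = (i ** 2 for i in range(10))    # squares are produced on demand

total_from_list = sum(as_list)
total_from_gen = sum(as_gen)

second_pass_list = sum(as_list)   # a list can be summed again
second_pass_gen = sum(as_gen)     # the generator is already exhausted
```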