1 Prefer list comprehensions
Replace the following code:
- cube_numbers = []
- for n in range(0, 10):
-     if n % 2 == 1:
-         cube_numbers.append(n**3)
with the equivalent list comprehension:
- cube_numbers = [n**3 for n in range(0, 10) if n % 2 == 1]
2 Built-in functions
Use Python's built-in functions wherever possible.
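As a rough illustration of this tip, the sketch below compares a hand-written loop with a few common built-ins. The sample data and the helper name manual_total are assumptions made for this example, not taken from the original list of functions.

```python
numbers = [3, 1, 4, 1, 5, 9, 2, 6]  # hypothetical sample data


def manual_total(values):
    # Hand-written accumulation that sum() makes unnecessary
    total = 0
    for v in values:
        total += v
    return total


total = sum(numbers)                          # replaces the loop above
largest = max(numbers)
smallest = min(numbers)
ordered = sorted(numbers)                     # new sorted list, input untouched
has_even = any(n % 2 == 0 for n in numbers)   # stops at the first even number
all_positive = all(n > 0 for n in numbers)    # stops at the first non-positive
```

The built-in versions are shorter and state the intent directly, which is usually reason enough to prefer them.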
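3 Use generators wherever possible
When processing a large amount of data on a single machine, generators are often useful: they produce values one at a time instead of building the whole result in memory. The web crawler below uses yield to hand back each page as it is visited.
- import re
- from urllib.parse import urljoin
- import requests
- def get_pages(link):
-     # Breadth-first crawl starting from link
-     pages_to_visit = [link]
-     seen = {link}  # avoid queueing the same URL twice
-     pattern = re.compile('https?')
-     while pages_to_visit:
-         current_page = pages_to_visit.pop(0)
-         page = requests.get(current_page)
-         for url in re.findall('<a href="([^"]+)">', page.text):
-             # Resolve relative links against the current page
-             url = urljoin(current_page, url)
-             if pattern.match(url) and url not in seen:
-                 seen.add(url)
-                 pages_to_visit.append(url)
-         # yield returns one page at a time instead of a full list
-         yield current_page
- webpage = get_pages('http://www.example.com')
- for result in webpage:
-     print(result)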
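For a version of the same idea that needs no network access, the sketch below yields fixed-size chunks from any iterable, so only one chunk is held in memory at a time. The name read_in_chunks and the sample input are assumptions made for illustration.

```python
def read_in_chunks(items, chunk_size):
    """Yield lists of at most chunk_size items, one chunk at a time."""
    chunk = []
    for item in items:
        chunk.append(item)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []  # start a fresh chunk; the old one can be freed
    if chunk:
        yield chunk     # final, possibly shorter, chunk


# Hypothetical usage: process seven items three at a time
for piece in read_in_chunks(range(7), 3):
    print(piece)
```

Because the function is a generator, nothing runs until the loop asks for the next chunk.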
4 The fastest way to test membership is in
- # member_list is assumed to be an existing collection of names
- for name in member_list:
-     print('{} is a member'.format(name))
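The loop above iterates over the members; an explicit membership test with in might look like the sketch below. The sample names and the helper is_member are hypothetical. Converting the list to a set is an extra suggestion here, on the assumption that many lookups are made against the same collection.

```python
member_list = ["Alice", "Bob", "Carol"]  # hypothetical sample data
member_set = set(member_list)            # hashing gives O(1) average lookups


def is_member(name, members):
    # Works for lists, tuples, sets, and dict keys alike
    return name in members


print(is_member("Bob", member_list))   # linear scan of the list
print(is_member("Dave", member_set))   # hash lookup in the set
```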
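5 Use sets to compute intersections
Replace the following code:
- a = [1,2,3,4,5]
- b = [2,3,4,5,6]
- overlaps = []
- for x in a:
-     for y in b:
-         if x == y:
-             overlaps.append(x)
- print(overlaps)
with sets and the intersection operator (note that the result is a set, so order and duplicates are not preserved):
- a = [1,2,3,4,5]
- b = [2,3,4,5,6]
- overlaps = set(a) & set(b)
- print(overlaps)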
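If the order of the first list matters, one possible variant keeps a list result while still using a set for the lookups. The function name ordered_overlaps is an assumption made for this sketch.

```python
a = [1, 2, 3, 4, 5]
b = [2, 3, 4, 5, 6]


def ordered_overlaps(first, second):
    lookup = set(second)  # built once; each membership check is O(1) on average
    # Keep the elements of first, in their original order, that also occur in second
    return [x for x in first if x in lookup]


print(ordered_overlaps(a, b))
```

This avoids the nested loop but still returns the overlaps in the order they appear in a.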
6 Multiple assignment
Python supports assigning several variables in a single statement; use it freely.
- first_name, last_name, city = "Kevin", "Cunningham", "Brighton"
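Two related forms of the same feature, sketched with made-up values: swapping without a temporary variable, and starred unpacking to split a sequence.

```python
# Swap two values without a temporary variable
x, y = 1, 2
x, y = y, x

# Starred unpacking: first element, then everything else as a list
head, *rest = [1, 2, 3, 4]

print(x, y)        # the values have traded places
print(head, rest)  # first item and the remainder
```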
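7 Use global variables sparingly
Python resolves local variables fastest; looking up a global variable is noticeably slower. In CPython, locals are accessed by index, while each global access requires a dictionary lookup. Prefer local variables and keep globals to a minimum.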
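A common way to apply this inside a hot loop is to bind a global name to a local one first. The two functions below are a sketch under that assumption; their names are invented for illustration, and the actual speed difference depends on the interpreter version and workload.

```python
import math


def sum_sqrt_global(values):
    total = 0.0
    for v in values:
        # Looks up the global name math, then its sqrt attribute, every iteration
        total += math.sqrt(v)
    return total


def sum_sqrt_local(values):
    sqrt = math.sqrt  # resolve once, then use the fast local name
    total = 0.0
    for v in values:
        total += sqrt(v)
    return total
```

Both functions return the same result; timing them with the timeit module on your own data is the reliable way to see whether the change is worth it.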
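8 The efficient itertools module
The itertools module provides building blocks for working with iterators lazily, which keeps memory use low, so it is worth using often. The example below computes all permutations of three elements:
- import itertools
- perms = itertools.permutations(["Alice", "Bob", "Carol"])  # renamed from iter to avoid shadowing the built-in
- list(perms)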
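Calling list() materializes every permutation at once. When only a few results are needed, itertools.islice can take them lazily, as in this sketch; the variable names are illustrative.

```python
import itertools

names = ["Alice", "Bob", "Carol"]

# Take only the first two permutations; the rest are never generated
first_two = list(itertools.islice(itertools.permutations(names), 2))

# All unordered pairs, another iterator from the same module
pairs = list(itertools.combinations(names, 2))

print(first_two)
print(pairs)
```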
9 Caching with lru_cache
The lru_cache decorator in the functools module caches function results. Combined with recursion, it computes the n-th Fibonacci number efficiently:
- import functools
- @functools.lru_cache(maxsize=128)
- def fibonacci(n):
-     if n == 0:
-         return 0
-     elif n == 1:
-         return 1
-     return fibonacci(n - 1) + fibonacci(n - 2)
By contrast, the plain recursive version below is very inefficient, because it solves the same subproblems over and over:
- def fibonacci(n):
-     if n == 0:  # Base case: the 0th number is 0
-         return 0
-     elif n == 1:  # We define the first number as 1
-         return 1
-     return fibonacci(n - 1) + fibonacci(n - 2)
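To make the difference concrete, the sketch below inspects the cache statistics of the decorated function and counts how many calls the uncached recursion would make. The helper plain_call_count is an assumption added for this comparison.

```python
import functools


@functools.lru_cache(maxsize=128)
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)


def plain_call_count(n):
    """Number of calls the uncached recursion makes to compute fibonacci(n)."""
    if n < 2:
        return 1
    return 1 + plain_call_count(n - 1) + plain_call_count(n - 2)


fibonacci.cache_clear()
result = fibonacci(20)
info = fibonacci.cache_info()  # named tuple with hits, misses, maxsize, currsize
print(result, info.hits, info.misses)
print(plain_call_count(20))
```

With the cache, each value from 0 to 20 is computed once (21 misses); without it, the same input triggers 21891 calls.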
10 Built-in functions, key, and itemgetter
As mentioned above, prefer built-ins where you can. The example below sorts a list using the key argument and operator.itemgetter:
- import operator
- my_list = [("Josh", "Grobin", "Singer"), ("Marco", "Polo", "General"), ("Ada", "Lovelace", "Scientist")]
- my_list.sort(key=operator.itemgetter(0))
- my_list
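itemgetter also accepts several indices, which gives a multi-field sort key. The sketch below reuses the list from above; the result variable names are illustrative.

```python
import operator

my_list = [
    ("Josh", "Grobin", "Singer"),
    ("Marco", "Polo", "General"),
    ("Ada", "Lovelace", "Scientist"),
]

# Sort by last name (index 1); sorted() returns a new list
by_last_name = sorted(my_list, key=operator.itemgetter(1))

# Sort by occupation (index 2), breaking ties by first name (index 0)
by_job_then_first = sorted(my_list, key=operator.itemgetter(2, 0))

print(by_last_name)
print(by_job_then_first)
```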