Tips and tricks from my Telegram channel @pythonetc, February 2019

This is a new selection of tips and tricks about Python and programming from my Telegram channel @pythonetc.

Previous publications.

Comparing structures

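Sometimes you want to compare complex structures in tests while ignoring some of the values. Usually, this is done by checking particular values of the structure one by one: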

    >>> d = dict(a=1, b=2, c=3)
    >>> assert d['a'] == 1
    >>> assert d['c'] == 3

However, you can create a special value that reports itself as equal to any other value:

    >>> assert d == dict(a=1, b=ANY, c=3)

That is easy to do by defining the __eq__ method:

    >>> class AnyClass:
    ...     def __eq__(self, another):
    ...         return True
    ...
    >>> ANY = AnyClass()
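The same trick works for nested structures, because dicts and lists compare their elements with ==. Below is a minimal sketch; the response payload is made up for illustration. The standard library also ships a ready-made object with this behaviour, unittest.mock.ANY, shown next to the hand-written one.

```python
from unittest import mock


class AnyClass:
    """Compares equal to every other object."""

    def __eq__(self, another):
        return True

    def __repr__(self):
        return 'ANY'


ANY = AnyClass()

# Hypothetical payload: the id and the timestamp differ from run to run.
response = {
    'id': 42817,
    'user': {'name': 'alice', 'created_at': '2019-02-01T10:00:00'},
    'tags': ['a', 'b'],
}

expected = {
    'id': ANY,
    'user': {'name': 'alice', 'created_at': ANY},
    'tags': ['a', 'b'],
}

# The same expectation written with the standard library helper.
expected_with_mock = {
    'id': mock.ANY,
    'user': {'name': 'alice', 'created_at': mock.ANY},
    'tags': ['a', 'b'],
}

matches_custom = response == expected
matches_mock = response == expected_with_mock
```

Note that a class defining __eq__ without __hash__ is unhashable, so this ANY can stand in for dict values and list items but not for dict keys.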

sys.stdout is a wrapper that lets you write strings instead of raw bytes. The strings are encoded automatically using sys.stdout.encoding:

    >>> import sys
    >>> _ = sys.stdout.write('Straße\n')
    Straße
    >>> sys.stdout.encoding
    'UTF-8'

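sys.stdout.encoding is read-only and equals the default encoding Python picked for its standard streams. That default can be changed by setting the PYTHONIOENCODING environment variable: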

    $ PYTHONIOENCODING=cp1251 python3
    Python 3.6.6 (default, Aug 13 2018, 18:24:23)
    [GCC 4.8.5 20150623 (Red Hat 4.8.5-28)] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import sys
    >>> sys.stdout.encoding
    'cp1251'

If you want to write bytes to stdout, you can bypass the automatic encoding by accessing the wrapped buffer with sys.stdout.buffer:

    >>> sys.stdout
    <_io.TextIOWrapper name='<stdout>' mode='w' encoding='cp1251'>
    >>> sys.stdout.buffer
    <_io.BufferedWriter name='<stdout>'>
    >>> _ = sys.stdout.buffer.write(b'Stra\xc3\x9fe\n')
    Straße

sys.stdout.buffer is a wrapper too: it does the buffering for you. It can be bypassed by accessing the raw file handler with sys.stdout.buffer.raw:

    >>> _ = sys.stdout.buffer.raw.write(b'Stra\xc3\x9fe')
    Straße
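To see the text layer and the byte layer side by side without touching the real stdout, here is a small sketch that builds the same kind of wrapper around an in-memory byte stream. The io.BytesIO object only stands in for the terminal, and the cp1251 encoding is picked to mirror the session above.

```python
import io

# In-memory byte stream standing in for the real output device.
raw = io.BytesIO()

# Same wrapper class as sys.stdout; newline='\n' disables newline translation
# so the result is identical on every platform.
text_layer = io.TextIOWrapper(raw, encoding='cp1251', newline='\n')

# A str goes in and is encoded with cp1251 on its way down.
text_layer.write('Привет\n')
text_layer.flush()

# Bytes written to the wrapped buffer skip the encoding step entirely.
text_layer.buffer.write(b'\xff\xfe')
text_layer.buffer.flush()

written = raw.getvalue()
```

The flush call matters: the text layer keeps pending data until it is flushed, so bytes written straight to the buffer could otherwise end up ahead of earlier text.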

Ellipsis constant

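Python has a very short list of built-in constants. One of them is Ellipsis, which can also be written as ... (three dots). The constant has no special meaning for the interpreter, but it is used wherever such syntax looks appropriate.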

numpy accepts Ellipsis as a __getitem__ argument; for example, x[...] returns all elements of x.

PEP 484 defines another meaning: Callable[..., type] is a way to describe a callable type without specifying the argument types.

Finally, you can use ... to indicate that a function is not implemented yet. This is completely valid Python code:

    def x():
        ...

However, Ellipsis can't be written as ... in Python 2. The only exception is a[...], which is interpreted as a[Ellipsis].

All of the following syntax is valid in Python 3, but only the first line is valid in Python 2:

    a[...]
    a[...:2:...]
    [..., ...]
    {...:...}
    a = ...
    ... is ...
    def a(x=...): ...
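Here is a short runnable sketch of the uses mentioned above. The Grid class and the apply_twice function are invented for the example; Grid only imitates the numpy-style x[...] convention and is not how numpy implements it.

```python
from typing import Callable


def apply_twice(func: Callable[..., int], *args) -> int:
    """Call func twice with the same arguments and add the results.

    Callable[..., int] says: any arguments are fine, the result is an int.
    """
    return func(*args) + func(*args)


class Grid:
    """Hypothetical container where grid[...] returns every cell as a flat list."""

    def __init__(self, rows):
        self.rows = rows

    def __getitem__(self, key):
        # The ... literal arrives here as the Ellipsis singleton.
        if key is Ellipsis:
            return [cell for row in self.rows for cell in row]
        return self.rows[key]


same_object = ... is Ellipsis
total = apply_twice(lambda a, b: a + b, 1, 2)
cells = Grid([[1, 2], [3, 4]])[...]
first_row = Grid([[1, 2], [3, 4]])[0]
```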

Re-importing modules

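Modules that are already imported are not loaded again: a repeated import foo does nothing. However, re-importing a module turns out to be useful when you work in an interactive environment. The proper way to do this in Python 3.4+ is to use importlib: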

    In [1]: import importlib

    In [2]: with open('foo.py', 'w') as f:
       ...:     f.write('a = 1')
       ...:

    In [3]: import foo

    In [4]: foo.a
    Out[4]: 1

    In [5]: with open('foo.py', 'w') as f:
       ...:     f.write('a = 2')
       ...:

    In [6]: foo.a
    Out[6]: 1

    In [7]: import foo

    In [8]: foo.a
    Out[8]: 1

    In [9]: importlib.reload(foo)
    Out[9]: <module 'foo' from '/home/v.pushtaev/foo.py'>

    In [10]: foo.a
    Out[10]: 2

ipython also has the autoreload extension, which automatically re-imports modules when necessary:

    In [1]: %load_ext autoreload

    In [2]: %autoreload 2

    In [3]: with open('foo.py', 'w') as f:
       ...:     f.write('print("LOADED"); a=1')
       ...:

    In [4]: import foo
    LOADED

    In [5]: foo.a
    Out[5]: 1

    In [6]: with open('foo.py', 'w') as f:
       ...:     f.write('print("LOADED"); a=2')
       ...:

    In [7]: import foo
    LOADED

    In [8]: foo.a
    Out[8]: 2

    In [9]: with open('foo.py', 'w') as f:
       ...:     f.write('print("LOADED"); a=3')
       ...:

    In [10]: foo.a
    LOADED
    Out[10]: 3
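The importlib session can also be reproduced as a plain script, which is handy for checking the behaviour outside IPython. This is only a sketch: the module name reload_demo_module and the temporary directory are made up, and bytecode writing is switched off because two quick writes to the same file can otherwise leave a stale cached .pyc behind.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Assumption for the demo: skip .pyc files so a fast rewrite is never masked
# by cached bytecode with the same timestamp. This affects the whole process.
sys.dont_write_bytecode = True

with tempfile.TemporaryDirectory() as tmp:
    module_path = Path(tmp) / 'reload_demo_module.py'  # hypothetical module
    sys.path.insert(0, tmp)
    try:
        module_path.write_text('a = 1\n')
        importlib.invalidate_caches()  # make sure the new file is noticed
        import reload_demo_module
        first = reload_demo_module.a

        module_path.write_text('a = 22\n')
        import reload_demo_module  # no-op: the module is already in sys.modules
        after_import = reload_demo_module.a

        importlib.reload(reload_demo_module)  # re-executes the source file
        after_reload = reload_demo_module.a
    finally:
        # Leave the interpreter state as we found it.
        sys.path.remove(tmp)
        sys.modules.pop('reload_demo_module', None)
```

Keep in mind that reload re-executes the module in place: names imported earlier with from module import name keep pointing at the old objects.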

\G

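In some languages, you can use the \G assertion. It matches at the position where the previous match ended. That makes it possible to write finite automata that walk through a string word by word (where a word is defined by the regex).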

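However, there is no such thing in Python. The proper workaround is to track the position manually and pass the remaining substring to the regex functions: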

    import re
    import json

    text = '<a><b>foo</b><c>bar</c></a><z>bar</z>'
    # One alternative per token kind: opening tag, closing tag, text data.
    regex = '^(?:<([a-z]+)>|</([a-z]+)>|([a-z]+))'

    stack = []
    tree = []

    pos = 0
    while len(text) > pos:
        error = f'Error at {text[pos:]}'
        # ^ anchors at the start of the slice, i.e. at the current position.
        found = re.search(regex, text[pos:])
        assert found, error
        pos += len(found[0])
        start, stop, data = found.groups()

        if start:
            tree.append(dict(
                tag=start,
                children=[],
            ))
            stack.append(tree)
            tree = tree[-1]['children']
        elif stop:
            tree = stack.pop()
            assert tree[-1]['tag'] == stop, error
            if not tree[-1]['children']:
                tree[-1].pop('children')
        elif data:
            stack[-1][-1]['data'] = data

    print(json.dumps(tree, indent=4))

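In the previous example, we can save some time by not slicing the string again and again, and asking the re module to start the search from a different position instead.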

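That requires some changes. First, re.search doesn't support searching from a custom position, so we have to compile the regular expression manually. Second, ^ means the real start of the string, not the position where the search started, so we have to check manually that the match begins at that exact position.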

    import re
    import json

    text = '<a><b>foo</b><c>bar</c></a><z>bar</z>' * 10


    def print_tree(tree):
        print(json.dumps(tree, indent=4))


    def xml_to_tree_slow(text):
        regex = '^(?:<([a-z]+)>|</([a-z]+)>|([a-z]+))'

        stack = []
        tree = []

        pos = 0
        while len(text) > pos:
            error = f'Error at {text[pos:]}'
            # Slicing copies the rest of the string on every iteration.
            found = re.search(regex, text[pos:])
            assert found, error
            pos += len(found[0])
            start, stop, data = found.groups()

            if start:
                tree.append(dict(
                    tag=start,
                    children=[],
                ))
                stack.append(tree)
                tree = tree[-1]['children']
            elif stop:
                tree = stack.pop()
                assert tree[-1]['tag'] == stop, error
                if not tree[-1]['children']:
                    tree[-1].pop('children')
            elif data:
                stack[-1][-1]['data'] = data

        return tree


    # No ^ here: it would only match at the real start of the string.
    _regex = re.compile('(?:<([a-z]+)>|</([a-z]+)>|([a-z]+))')


    def _error_message(text, pos):
        # Built only when an assertion fails, so no slicing on the happy path.
        return text[pos:]


    def xml_to_tree_fast(text):
        stack = []
        tree = []

        pos = 0
        while len(text) > pos:
            # Search the original string starting from pos.
            found = _regex.search(text, pos=pos)
            # Check for a match first: found.span() would fail on None.
            assert found, _error_message(text, pos)
            begin, end = found.span(0)
            # The match must start exactly where the previous one ended.
            assert begin == pos, _error_message(text, pos)
            pos += len(found[0])
            start, stop, data = found.groups()

            if start:
                tree.append(dict(
                    tag=start,
                    children=[],
                ))
                stack.append(tree)
                tree = tree[-1]['children']
            elif stop:
                tree = stack.pop()
                assert tree[-1]['tag'] == stop, _error_message(text, pos)
                if not tree[-1]['children']:
                    tree[-1].pop('children')
            elif data:
                stack[-1][-1]['data'] = data

        return tree


    print_tree(xml_to_tree_fast(text))

Results:

    In [1]: from example import *

    In [2]: %timeit xml_to_tree_slow(text)
    356 µs ± 16.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

    In [3]: %timeit xml_to_tree_fast(text)
    294 µs ± 6.15 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
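As a side note, compiled patterns also have a match method that accepts a pos argument and succeeds only if the match begins exactly at pos. That gives \G-like anchoring without a separate position check. The sketch below is a bare tokenizer rather than the tree builder from the post, and the names TOKEN, KINDS and tokenize are invented for the example.

```python
import re

# One capturing group per token kind; exactly one of them matches each time.
TOKEN = re.compile(r'<([a-z]+)>|</([a-z]+)>|([a-z]+)')
KINDS = {1: 'start', 2: 'stop', 3: 'data'}


def tokenize(text):
    """Yield (kind, value) pairs, failing at the first unmatched position."""
    pos = 0
    while pos < len(text):
        # match() is anchored at pos: no slicing and no span check needed.
        found = TOKEN.match(text, pos)
        if found is None:
            raise ValueError(f'Unexpected input at position {pos}')
        # lastindex is the number of the group that took part in the match.
        yield KINDS[found.lastindex], found[found.lastindex]
        pos = found.end()


tokens = list(tokenize('<a>foo</a>'))
```

Raising ValueError instead of using assert keeps the check alive when Python runs with the -O flag, which strips assertions.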

Round function

Today's post is written by orsinium, the author of @itgram_channel.

The round function rounds a number to a given precision in decimal digits.

    >>> round(1.2)
    1
    >>> round(1.8)
    2
    >>> round(1.228, 1)
    1.2

You can also set a negative precision:

    >>> round(413.77, -1)
    410.0
    >>> round(413.77, -2)
    400.0

round returns a value of the same type as the input number:

    >>> from decimal import Decimal
    >>> from fractions import Fraction
    >>> type(round(2, 1))
    <class 'int'>
    >>> type(round(2.0, 1))
    <class 'float'>
    >>> type(round(Decimal(2), 1))
    <class 'decimal.Decimal'>
    >>> type(round(Fraction(2), 1))
    <class 'fractions.Fraction'>

For your own classes, you can define how rounding works with the __round__ method:

    >>> class Number(int):
    ...     def __round__(self, p=-1000):
    ...         return p
    ...
    >>> round(Number(2))
    -1000
    >>> round(Number(2), -2)
    -2

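Values are rounded to the closest multiple of 10 ** (-precision). For example, with precision=1 a value is rounded to a multiple of 0.1: round(0.63, 1) returns 0.6. If two multiples are equally close, rounding goes toward the even choice: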

    >>> round(0.5)
    0
    >>> round(1.5)
    2

Sometimes rounding a float can be a little surprising:

    >>> round(2.85, 1)
    2.9

This is because most decimal fractions can't be represented exactly as a float (https://docs.python.org/3.7/tutorial/floatingpoint.html):

    >>> format(2.85, '.64f')
    '2.8500000000000000888178419700125232338905334472656250000000000000'

If you want to round half up, you can use decimal.Decimal:

    >>> from decimal import Decimal, ROUND_HALF_UP
    >>> Decimal(1.5).quantize(0, ROUND_HALF_UP)
    Decimal('2')
    >>> Decimal(2.85).quantize(Decimal('1.0'), ROUND_HALF_UP)
    Decimal('2.9')
    >>> Decimal(2.84).quantize(Decimal('1.0'), ROUND_HALF_UP)
    Decimal('2.8')
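One caveat worth illustrating: Decimal(2.85) inherits the binary error of the float it was built from, so half-up rounding is only reliable when the value enters Decimal as a string (or as another exact Decimal). The helper below is a small sketch of that idea; the name round_half_up and its places parameter are made up for the example.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_up(value, places=0):
    """Round value half up to the given number of decimal places.

    Pass a str or Decimal for exact results; a float brings its binary
    representation error along.
    """
    # places=2 gives Decimal('0.01'); only its exponent matters to quantize.
    exponent = Decimal(1).scaleb(-places)
    return Decimal(value).quantize(exponent, rounding=ROUND_HALF_UP)


from_string = round_half_up('2.675', 2)  # exact decimal input
from_float = round_half_up(2.675, 2)     # the float is slightly below 2.675
builtin = round(2.5)                     # ties go to the even neighbour
half_up = round_half_up('2.5')           # ties go up
```

With this input, from_string is Decimal('2.68') while from_float is Decimal('2.67'), because the float closest to 2.675 is slightly smaller than 2.675.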
