Why does pathlib refuse byte strings?

2023-07-11 12:53

Daniel R Stromberg:

Rosuav:

Your backup program shouldn’t even need to care about this distinction, unless it needs to filter based on such files (in which case their special encoding will have to factor into your filtering rules).

You give a somewhat convincing argument.

But what if you want to write xargs in Python?

That’s fair! xargs itself shouldn’t have a problem (since subprocess invocation accepts bytestrings for args), but perhaps you’re writing an archive extractor or something. The file names come from deep inside a binary file, and you want to faithfully recreate them. That CAN be done, but now you need to be explicit that you really do want to accept potentially-broken file names:

>>> fn = b"some-file-\x81-oopsie"
>>> fn.decode("utf-8")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x81 in position 10: invalid start byte
>>> fn.decode("utf-8", "surrogateescape")
'some-file-\udc81-oopsie'
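To show how an extractor could carry such a name through pathlib and still get the exact bytes back, here is a minimal sketch. The "extracted" directory and the variable names are made up for illustration, and it uses PurePosixPath so nothing touches the real filesystem.

```python
from pathlib import PurePosixPath

# Hypothetical name pulled out of an archive; 0x81 is not valid UTF-8 here.
raw_name = b"some-file-\x81-oopsie"

# Decode explicitly, carrying the undecodable byte as the lone surrogate U+DC81.
text_name = raw_name.decode("utf-8", "surrogateescape")

# pathlib only ever sees a str, so it accepts the name.
target = PurePosixPath("extracted") / text_name

# Encoding with the same error handler restores the original bytes exactly.
round_tripped = target.name.encode("utf-8", "surrogateescape")

# A strict encode still refuses the lone surrogate, so the "broken" name
# cannot silently leak into places that expect real text.
try:
    target.name.encode("utf-8")
    strict_encode_ok = True
except UnicodeEncodeError:
    strict_encode_ok = False
```

On POSIX systems, os.fsdecode() and os.fsencode() apply this same error handler using the filesystem encoding, so they are usually the more convenient spelling when the names come from or go to the local filesystem.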

This is a much less common use-case than the situations where you want real text in file names (file names are for humans, and humans want text that they can understand), so the way to do it does require a clear and intentional declaration that you want this surrogate handling.
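Coming back to the xargs question, the point that subprocess accepts bytestrings can be sketched roughly like this. The helper names (split_nul, batches, run_xargs) and the batch size are made up for the example, and it assumes NUL-delimited input such as that produced by find -print0; it is not a full xargs replacement.

```python
import subprocess
from typing import Iterator, List


def split_nul(data: bytes) -> List[bytes]:
    """Split NUL-delimited input into raw names, dropping empty entries."""
    return [item for item in data.split(b"\0") if item]


def batches(items: List[bytes], size: int) -> Iterator[List[bytes]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_xargs(command: List[bytes], data: bytes, size: int = 100) -> None:
    """Run `command` once per batch, appending the raw byte names as arguments."""
    for batch in batches(split_nul(data), size):
        # The names are never decoded, so an invalid byte sequence is
        # handed to the child process unchanged.
        subprocess.run(command + batch, check=True)
```

A caller would feed it something like run_xargs([b"ls", b"-l"], sys.stdin.buffer.read()). No decoding happens anywhere, which is why this case does not need the surrogate handling at all.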
