Search and replace a line in a file in Python

Author: 落单鸟人 | 2023-09-05 21:17

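I want to loop over the contents of a text file, do a search and replace on some lines, and write the result back to the file. I could first load the whole file into memory and then write it back, but that is probably not the best way to do it.

What is the best way to do this, within the following code?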

f = open(file)
for line in f:
    if 'foo' in line:  # str has no .contains() method; use the in operator
        newline = line.replace('foo', 'bar')
        # how to write this newline back to the file

Eli Bendersk.. 248

The shortest way is probably to use the fileinput module. For example, the following adds line numbers to a file, in place:

import fileinput

for line in fileinput.input("test.txt", inplace=True):
    print('{} {}'.format(fileinput.filelineno(), line), end='') # for Python 3
    # print "%d: %s" % (fileinput.filelineno(), line), # for Python 2

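What happens here is:

The original file is moved to a backup file

Standard output is redirected to the original file within the loop

So any print statements write back into the original file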

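fileinput has more bells and whistles. For example, it can automatically operate on all the files in sys.argv[1:], without your having to iterate over them explicitly. Starting with Python 3.2 it also provides a convenient context manager for use in a with statement.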
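
As a minimal sketch of the context-manager form mentioned above (Python 3.2+), the loop can be wrapped in a with statement so that the redirection of standard output is undone even if the loop raises. The function name `replace_in_place` and the `.bak` backup suffix are my own choices for illustration, not part of the original answer.

```python
import fileinput


def replace_in_place(path, old, new):
    """Replace old with new on every line of path, keeping a .bak copy."""
    # inplace=True redirects print() into the file being read;
    # backup=".bak" keeps the original next to it as path + ".bak".
    with fileinput.input(files=[path], inplace=True, backup=".bak") as f:
        for line in f:
            # Each line already ends with a newline, so suppress print's own.
            print(line.replace(old, new), end="")
```

Calling `replace_in_place("test.txt", "foo", "bar")` would rewrite test.txt and leave the untouched original in test.txt.bak.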

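While fileinput is great for throwaway scripts, I would be wary of using it in real code, because admittedly it is neither very readable nor very familiar. In real (production) code it is worth spending a few more lines to make the process explicit and thus make the code readable.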

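There are two options:

The file is not overly large, and you can simply read it wholly into memory. Then close the file, reopen it in write mode and write the modified contents back.

The file is too large to be stored in memory; you can move it over to a temporary file and open that, reading it line by line and writing back into the original file. Note that this requires twice the storage.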
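
A rough sketch of the two options, under stated assumptions: the function names are hypothetical, and the `.tmp` suffix used for the temporary copy is an arbitrary choice (a file already at that path would be overwritten).

```python
import os


def replace_in_memory(path, old, new):
    """Option 1: read the whole file, then reopen it in write mode."""
    with open(path) as f:
        text = f.read()
    with open(path, "w") as f:
        f.write(text.replace(old, new))


def replace_streaming(path, old, new):
    """Option 2: move the file aside, then stream it back line by line."""
    tmp_path = path + ".tmp"  # assumed temp name, chosen for illustration
    os.replace(path, tmp_path)  # the original now lives at tmp_path
    with open(tmp_path) as src, open(path, "w") as dst:
        for line in src:
            dst.write(line.replace(old, new))
    os.remove(tmp_path)  # both copies existed until this point
```

The second function only ever holds one line in memory, at the cost of the doubled disk usage mentioned above.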

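The key bit here is the comma at the end of the print statement: it stops print from adding another newline (the line already has one). It is not obvious at all, though (which is why Python 3 changed that syntax, luckily). (32 upvotes)

This DOES write to the file. It redirects stdout to the file. Have a look at the [docs](http://docs.python.org/library/fileinput.html) (14 upvotes)

I know this is only two lines, but I don't think the code is very expressive in itself. If you think about it for a second, if you didn't know the function, there are very few clues as to what is going on. Printing the line number and the line is not the same as writing it... if you get my gist... (12 upvotes)

For Python 3: `print(line, end='')` (5 upvotes)

Note that this does not work when you supply an opening hook for the file, e.g. when you try to read/write UTF-16 encoded files. (4 upvotes)

I agree. How does one use fileinput to write to the file? (3 upvotes)

@bompf: here is a [NamedTemporaryFile-based version that supports an arbitrary character encoding](http://stackoverflow.com/a/17222971/4279). (2 upvotes)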
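
For the encoding problem raised in the comments, one possible sketch in the spirit of the NamedTemporaryFile suggestion follows. It is not the linked code; the function name `replace_with_encoding` and the default of UTF-16 are assumptions made for illustration.

```python
import os
from tempfile import NamedTemporaryFile


def replace_with_encoding(path, old, new, encoding="utf-16"):
    """Rewrite path with old replaced by new, reading and writing in encoding."""
    # Create the temp file in the same directory so os.replace stays on one filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    with open(path, encoding=encoding) as src, \
            NamedTemporaryFile("w", encoding=encoding, dir=directory,
                               delete=False) as dst:
        for line in src:
            dst.write(line.replace(old, new))
    # Both files are closed here; swap the new file into place.
    os.replace(dst.name, path)
```

Because both files are opened explicitly, the encoding is under your control rather than left to whatever fileinput picks.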

Thomas Watne.. 181

I guess something like this should do it. It basically writes the content to a new file and replaces the old file with the new file:

from tempfile import mkstemp
from shutil import move
from os import fdopen, remove

def replace(file_path, pattern, subst):
    #Create temp file
    fh, abs_path = mkstemp()
    with fdopen(fh,'w') as new_file:
        with open(file_path) as old_file:
            for line in old_file:
                new_file.write(line.replace(pattern, subst))
    #Remove original file
    remove(file_path)
    #Move new file
    move(abs_path, file_path)

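Just a minor comment: `file` shadows the predefined class of the same name. (5 upvotes)

This code changes the permissions on the original file. How can I keep the original permissions? (4 upvotes)

@Wicelo You need to close it to prevent leaking the file descriptor. Here is a decent explanation: http://www.logilab.org/17873 (2 upvotes)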
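
On the permissions question, one way that might work is to copy the permission bits onto the temporary file with shutil.copymode before swapping it in. This is a sketch only; the name `replace_keep_mode` is hypothetical, and ownership and other metadata are not handled.

```python
import os
import shutil
from tempfile import mkstemp


def replace_keep_mode(file_path, pattern, subst):
    """Like the temp-file approach, but keeps the original permission bits."""
    fd, tmp_path = mkstemp()
    with os.fdopen(fd, "w") as new_file, open(file_path) as old_file:
        for line in old_file:
            new_file.write(line.replace(pattern, subst))
    # mkstemp creates the file readable and writable by the owner only;
    # copy the original file's mode over before the original is removed.
    shutil.copymode(file_path, tmp_path)
    os.remove(file_path)
    shutil.move(tmp_path, file_path)
```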

Jason.. 73

Here is another example that was tested, and will match search and replace patterns:

import fileinput
import sys

def replaceAll(file,searchExp,replaceExp):
    for line in fileinput.input(file, inplace=1):
        if searchExp in line:
            line = line.replace(searchExp,replaceExp)
        sys.stdout.write(line)

Example use:

replaceAll("/fooBar.txt","Hello\sWorld!$","Goodbye\sWorld.")

The example use supplies a regular expression, but neither `searchExp in line` nor `line.replace` is a regular expression operation. Surely the example use is wrong. (23 upvotes)
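
If regular expressions are what is wanted, a variant along these lines might do; it swaps the plain-string test for re.sub. The name `replace_all_regex` is an assumption, and note that the replacement string must not contain `\s` (re.sub rejects unknown escapes in the replacement on current Python 3 versions).

```python
import fileinput
import re
import sys


def replace_all_regex(path, search_pattern, replacement):
    """Apply a regular-expression substitution to every line of path, in place."""
    regex = re.compile(search_pattern)
    # The with block makes sure stdout is restored even if sub() raises.
    with fileinput.input(files=[path], inplace=True) as f:
        for line in f:
            # While inplace is active, sys.stdout points at the file being rewritten.
            sys.stdout.write(regex.sub(replacement, line))
```

With this version, `replace_all_regex("/fooBar.txt", r"Hello\sWorld!$", "Goodbye World.")` treats the first argument as a pattern; `$` matches just before the newline that ends each line.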

Kinlan.. 61

This should work: (in-place editing)

import fileinput

# Processes a list of files, and
# redirects STDOUT to the file in question
for line in fileinput.input(files, inplace=True):
    # Python 3 form; the Python 2 original was: print line.replace("foo", "bar"),
    print(line.replace("foo", "bar"), end="")

print adds a newline that may already be there. To avoid this, add .rstrip() at the end of the replacement. (8 upvotes)

+1. Also, if you get a RuntimeError: input() already active, call fileinput.close() (5 upvotes)

Thijs.. 23

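Based on the answer by Thomas Watnedal. However, that does not exactly answer the line-to-line part of the original question. The function can still replace on a line-to-line basis.

This implementation replaces the file contents without using temporary files; as a consequence, file permissions remain unchanged.

Also, using re.sub instead of replace allows regex replacement rather than plain-text replacement only.

Reading the file as a single string instead of line by line allows for multiline matching and replacement.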

import re

def replace(file, pattern, subst):
    # Read contents from file as a single string
    file_handle = open(file, 'r')
    file_string = file_handle.read()
    file_handle.close()

    # Use RE package to allow for replacement (also allowing for (multiline) REGEX)
    file_string = (re.sub(pattern, subst, file_string))

    # Write contents to file.
    # Using mode 'w' truncates the file.
    file_handle = open(file, 'w')
    file_handle.write(file_string)
    file_handle.close()

You might want to use the `rb` and `wb` modes when opening the files, as this will preserve the original line endings (2 upvotes)
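
A sketch of what the binary-mode suggestion could look like. The name `replace_bytes` is hypothetical; the important assumption is that both the pattern and the substitution are passed as bytes, since the file contents are bytes in these modes.

```python
import re


def replace_bytes(path, pattern, subst):
    """Regex-replace on the raw bytes of path, leaving line endings untouched."""
    # 'rb' and 'wb' skip newline translation, so \r\n stays \r\n.
    with open(path, "rb") as f:
        data = f.read()
    data = re.sub(pattern, subst, data)  # pattern and subst must be bytes
    with open(path, "wb") as f:
        f.write(data)
```

For example, `replace_bytes("a.txt", rb"foo", b"bar")` leaves a file with Windows line endings byte-for-byte identical apart from the replaced text.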

hamishmcn.. 11

As lassevk suggests, write out the new file as you go; here is some example code:

fin = open("a.txt")
fout = open("b.txt", "wt")
for line in fin:
    fout.write( line.replace('foo', 'bar') )
fin.close()
fout.close()

starryknight.. 11

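If you want a generic function that replaces any text with some other text, this is likely the best way to go, particularly if you are a fan of regexes: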

import re
def replace( filePath, text, subs, flags=0 ):
    with open( filePath, "r+" ) as file:
        fileContents = file.read()
        textPattern = re.compile( re.escape( text ), flags )
        fileContents = textPattern.sub( subs, fileContents )
        file.seek( 0 )
        file.truncate()
        file.write( fileContents )
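
One caveat with the function above: `re.escape` makes the search text literal, but the replacement string is still processed for backslash escapes by `sub`. A possible workaround, sketched here under the hypothetical name `replace_literal`, is to pass a callable as the replacement so the text is inserted verbatim.

```python
import re


def replace_literal(file_path, text, subs, flags=0):
    """Replace literal text with literal subs; flags still apply to matching."""
    pattern = re.compile(re.escape(text), flags)
    with open(file_path, "r+") as f:
        contents = f.read()
        # A callable replacement is used as-is, with no escape processing.
        contents = pattern.sub(lambda match: subs, contents)
        f.seek(0)
        f.truncate()
        f.write(contents)
```

This matters for replacements such as Windows paths, e.g. `replace_literal("a.txt", "foo", r"C:\new\dir", re.IGNORECASE)`.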

Kiran.. 9

A more pythonic way would be to use context managers, as in the code below:

from tempfile import mkstemp
from shutil import move
from os import fdopen, remove

def replace(source_file_path, pattern, substring):
    fh, target_file_path = mkstemp()
    # Wrap the descriptor returned by mkstemp so it is closed with the file
    with fdopen(fh, 'w') as target_file:
        with open(source_file_path, 'r') as source_file:
            for line in source_file:
                target_file.write(line.replace(pattern, substring))
    remove(source_file_path)
    move(target_file_path, source_file_path)

You can find the full snippet here.
