Extract lines from file, add to new files
dn
PythonList at DancesWithMice.info
Sat Jan 13 17:01:12 EST 2024
On 12/01/24 08:53, Rich Shepard via Python-list wrote:
> On Thu, 11 Jan 2024, Piergiorgio Sartor via Python-list wrote:
>
>> Why not to use bash script for all?
>
> Piergiorgio,
>
> That's certainly a possibility, and may well be better than python for this
> task.
(sitting in a meeting with little to occupy my mind, whilst tidying
email-InBox came back to this conversation)
In the bare-description of the task, I might agree to sticking with
BASH. The OP did say that the output from this will become input to a
sed/mailx task!
(we trust, does not involve spamming innocent folk)
However, that task could also be accomplished in Python. So, unless
there is an existing script, quite why one would choose to do half in
Python and half in BASH (or...) is open to question.
Because this is a Python forum, do the whole thing in one mode - our mode!
Previous suggestions involved identifying a line by its content.
That approach could use a neat state-transition solution.
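For illustration only, a minimal sketch of such a state-transition
approach. It assumes (my assumption, not stated in the thread) that an
address line can be recognised because it contains "@", and that every
name line is followed by exactly one address line. The names
extract_pairs, EXPECT_NAME and EXPECT_EMAIL are invented for the example.

```python
EXPECT_NAME, EXPECT_EMAIL = "expect-name", "expect-email"


def extract_pairs(lines):
    """Yield (name, email_address) tuples from alternating lines."""
    state = EXPECT_NAME
    name = None
    for line in lines:
        text = line.strip()
        if not text:
            continue  # blank lines carry no data
        is_email = "@" in text  # assumed test for an address line
        if state == EXPECT_NAME and not is_email:
            name = text
            state = EXPECT_EMAIL
        elif state == EXPECT_EMAIL and is_email:
            yield name, text
            state = EXPECT_NAME
        else:
            raise ValueError(f"unexpected line in state {state}: {text!r}")
    if state == EXPECT_EMAIL:
        raise ValueError(f"no email-address found for {name!r}")


sample = ["Calvin\n", "calvin@example.com\n", "Hobbs\n", "hobbs@some.com\n"]
pairs = list(extract_pairs(sample))
print(pairs)
```

Unlike a plain whitespace split, this version copes with names that
contain spaces, at the cost of trusting the "@" test.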
However, there is no need to consider the input-data as lines because of
the concept of "white-space", well-utilised by some of Python's built-in
string-functions. See code-sample, below.
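A small demonstration of the white-space point, using made-up data held
in a string rather than a file. Called without arguments, str.split()
treats any run of spaces, tabs, or newlines as a single separator, so
line boundaries simply disappear.

```python
# Made-up data: mixed newlines, a blank line, and multiple spaces.
raw = "Calvin\ncalvin@example.com\n\nHobbs   hobbs@some.com\n"

tokens = raw.split()  # no argument: split on any run of white-space
print(tokens)

# Caveat: a name containing a space becomes two tokens.
print("Calvin Smith".split())
```

The caveat matters: this shortcut only suits data where each name is a
single word.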
As mentioned before, the idea of splitting the one file (data-items
related by serial-progression) and creating two quite-separate
data-constructs (in this case: one holding the person's name in one file
and the other the person's email-address in another) which are related
'across', ie line-by-line, is an architectural error/horror. Such would
be hard to maintain and, over time, its integrity impossible to guarantee.
Assuming this is not a one-off exercise, see elsewhere for advice to
store the captured data in some more-useful format, eg JSON, CSV, or
even put into a MongoDB or RDBMS.
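As a hedged sketch of that advice, the following keeps each name and
email-address together in one record and serialises the records as CSV
and as JSON using only the standard library. The field names "name" and
"email" and the sample records are assumptions for illustration; an
in-memory buffer stands in for a real output file.

```python
import csv
import io
import json

# Illustrative records: each person's data stays together.
records = [
    {"name": "Calvin", "email": "calvin@example.com"},
    {"name": "Hobbs", "email": "hobbs@some.com"},
]

# CSV: one row per person. The StringIO buffer stands in for
# something like open("people.csv", "w", newline="").
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["name", "email"])
writer.writeheader()
writer.writerows(records)
csv_text = buffer.getvalue()

# JSON: the same records as a list of objects.
json_text = json.dumps(records, indent=2)

# Round trip: reading either format back recovers the pairs intact.
from_csv = list(csv.DictReader(io.StringIO(csv_text)))
from_json = json.loads(json_text)
print(from_csv == records, from_json == records)
```

Either way the name and address cannot drift apart, which is the
weakness of two files matched line-by-line.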
****** code
""" PythonExperiments:rich.py
Demonstrate string extraction.
"""

__author__ = "dn, IT&T Consultant"
__python__ = "3.12"
__created__ = "PyCharm, 14 Jan 2024"
__copyright__ = "Copyright © 2024~"
__license__ = "GNU General Public License v3.0"

# PyPI (third-party, not the PSL): pip install more-itertools
import more_itertools as it

DATA_FILE = "rich_data_file"
READ_ONLY = "r"
AS_PAIRS = 2
STRICT_PAIRING = True

if __name__ == "__main__":
    print("\nCommencing execution\n")

    # read the whole file as one string
    with open( DATA_FILE, READ_ONLY, ) as df:
        data = df.read()

    # split on any white-space, then group the tokens two-by-two;
    # strict pairing raises ValueError if a token is left over
    data_as_list = data.split()
    paired_data = it.chunked( data_as_list, AS_PAIRS, STRICT_PAIRING, )

    for name, email_address in paired_data:
        # replace this with email-function
        # and/or with storage-function
        print( name, email_address, )

    print("\nTerminating")
****** sample output
Calvin calvin@example.com
Hobbs hobbs@some.com
...
******
--
Regards,
=dn