[Tutor] make a sqlite3 database from an ordinary text file
avi.e.gross at gmail.com
avi.e.gross at gmail.com
Sun Jun 19 23:03:15 EDT 2022
With all due respect, Manprit, your text file may not end in .csv or .dat
but is just as structured as files that are. Many such files now have no
funny headers or anything and can be edited using a flat text editor of any
kind and you can enter data one at a time on each line with your designated
delimiter (space, tab, comma, colon, whatever) between items and not at the
end of a line. If all lines have the same number
Of items and you save the file, you can often read it in no matter what the
extension is.
Nobody is arguing with you about this, but trying to educate you. Just about
any programming language, not just python, has available facilities to read
in such files for you or let you read it a line at a time and use some
built-in method to split the line by those separators.
If it turn out that your problem in reading it in using SQL or python is
that the name of the file ends in .txt, then do something to tell it what it
is in a way that over-rides the guess it makes OR rename or copy the file
(you can even do that in python) from xyy.txt to xyz.tsv or xyz.tab or
whatever makes the software happy!
I repeat. Most such files are indistinguishable from text files albeit there
may be some that add some kind of meta-data as something like comments that
software can use or ignore.
Most of the functionality people use to read in a file either lets you
specify that you expect the first line to be a header or that you do not
want it to be read. Some make a guess if the first line is not numeric and
the rest are.
In your case, I believe you have an answer already and maybe have not
figured it out.
If you want us to write the code for you, hire someone. We are here more for
hints and education.
But in English, you seem to want a multistep process and to do it YOURSELF.
Fine. Write your function and call it ONCE like this:
Col1Name, col2Name <- func("file.xt")
You now swallowed one line of header and have the text,
THEN you can call your other function and let it call func() in iteration
till done. Every call will return a tuple of strings that contain a number
as text.
You now have the names in one place and a growing list of "values" in
another. Presumably the way you call other code results in one place asking
to create or use a table with those column names, and the second keeps
inserting entries into that table by supplying data in the same order as the
columns.
I pointed out earlier that the code may not be right unless you made a
spelling mistake in showing it to us and you should read one line at a time
IN THE LOOP and not read one line once before. I am not convinced your
function is correct.
I am curious what you consider a big file. If your machine has enough
memory, you can process quite a large file and since python has garbage
collection, you can easily reclaim it by something as simple as putting
something new into the variable that held it, including deleting it
entirely.
I find your suggestion about python confusing. Your application is
straightforward to do in just about ANY language I have used and that is
well over a hundred I have studied. Feel free to use another, but NOT
because you claim it is hard in python. Mind you, ideas like generators are
not everywhere you look. In some more limited languages, you might simply
read a line at a time and perhaps open a temporary file in which you place
SQL statements to start the process, then in a loop read your .txt and
produce a line or two vaguely like this:
INSERT INTO table_name (column1, column2, column3, ...)
VALUES (value1, value2, value3, ...);
Finally you could throw that file at your SQL provider and wait a week for
it to finish.
Many SQL databases allow you to put in many at once with some syntax like
this:
INSERT INTO MyTable
( Column1, Column2, Column3 )
VALUES
('John', 123, 'Lloyds Office'),
('Jane', 124, 'Lloyds Office'),
('Billy', 125, 'London Office'),
('Miranda', 126, 'Bristol Office');
You can imagine making those by yourself.
But you are using some modules that do this for you and that probably is a
great idea. So work with it and see what the manual says is available. I
took a look as I have never had reason to use what you are talking about and
see your code:
cur.execute("create table workdata(Time INT, Pieces INT)")
The above is doing nothing but passing along what you created exactly. So
what you seem to want to do is take a string of text and interpolate the
words "Time" and "Pieces" into a template after reading them from the first
line of the text file, right? Python has at least 5 major ways to do that
and quite a few more. If you know how, fine. If not, ask a specific
question, please.
The next line of your code is:
cur.executemany("insert into workdata values (?, ?)",
data_generator("abcd.txt"))
The above is actually fairly simple shorthand for doing this by yourself in
pseudocode AFTER the table has been created using the first line of the
data-file:
Forever until out of data:
Col1, Col2 = next_line_read_split
MyCommand = interpolate Col1 and Col2 into "insert into workdata values
(%s, %s)" using one of many methods such as replacing the two instances of
%s
Run this: cur.execute(MyCommand)
The more complex loop you feed a generator to can be easily avoided if that
seems more complex. Heck, you do not need to have a generator at all, just
keep reading a line till done in the loop.
If the above makes no sense to you, fine. Ask specifically what step you do
not know how to do or give an example of your code and ask why what you did
is not working. Note the problem in cases like this often is not in the code
where you think but maybe in the code it calls.
If you want to solve a problem, this one is not hard in any number of ways.
If you want to solve it using cute shortcuts like the above that ask for a
generator, it may be a tad harder.
-----Original Message-----
From: Tutor <tutor-bounces+avi.e.gross=gmail.com at python.org> On Behalf Of
Manprit Singh
Sent: Sunday, June 19, 2022 9:00 PM
To: tutor at python.org
Subject: Re: [Tutor] make a sqlite3 database from an ordinary text file
Dear all,
Learning is a lifelong process . At this point i still do not feel i am
absolutely ok with python or programming. In the past i was very active in
this mailing list and will appreciate a few - Alan sir, Dennis Lee Bieber,
dn and few others, they always answered my queries .
As i wrote in the starting , i have a simple text file named abcd.txt. (I am
not using a csv file or any other structured file) . The contents are given
below :
Time Pieces
1 10
2 15
3 25
4 31
5 40
6 45
7 53
8 54
9 65
10 75
We can clearly see the first line of the file is
Time Pieces
Now I have to read this text into an sqlite3 database table , So it is
clear that the database table will have 2 columns, Column names of the
sqlite3 table are to be picked from the first line of the text file
abcd.txt. (First column name will be Time, second column name will be Pieces
).
See i know this can be done very easily using numpy and Pandas. But I am
trying to do it with Core Python only.
cur.execute("create table workdata(? INT, ? INT)", ("Time", "Pieces"))
I just want to insert column names (where these column names are read from
text file abcd.txt 1st line) using a question mark style placeholder. Pls
let me know if anything is possible .
Time Pieces
On Mon, Jun 20, 2022 at 5:34 AM <avi.e.gross at gmail.com> wrote:
> Manprit,
>
> You may be approaching this topic differently than you could.
>
> What you keep calling a TEXT file is what others might call structured
> files like .CSV or .DAT types that contain values separated by some
> delimiter like a comma or a tab or arbitrary whitespace or other
> choices. Lots of python functionalities are available to read these
> in to structures used in modules like numpy and pandas that many
> people use on top of or alongside regular built-in python commands. In
> these, there can be (and usually are) HEADER names for columns and
> sometimes also rows. The functionality that reads in such tables,
> Dataframes, or whatever you want to call it, will often optionally
> take the first line of the file to have headers specifying names. And,
> you can always add or change the names as well.
>
> So if passing the entire structure to another function, it can choose
> to access the name of any column (or often row) when it wants. Or, if
> you need to, you can ask for it and pass it your way as in
> interpolating the name into an SQL command.
>
> You may be asking a slightly different question here about just your data.
> Does your "text" file contain a first line saying ""Time Pieces" or
> not?
>
> If it DOES, you may need to read it ONCE at the start before giving
> your iterator for the numbers to be read and store the results as your
> names to use as you wish. If the file has no header, then your code is
> the one attaching a name.
>
> To me the real issue is your design seems to be of a limited nature.
> You want to read in your data in a small batch (single line) and
> presumably are opening a connection to the database and doing login
> and so on, then sending a command to create a table, unless it already
> exists, then sending umpteen commands to insert yet another row into
> the table and then who knows? This seems like such a common operation
> that I have to guess there is a way to do it fairly trivially and my
> guess is way more efficiently.
>
>
>
> -----Original Message-----
> From: Tutor <tutor-bounces+avi.e.gross=gmail.com at python.org> On Behalf
> Of Manprit Singh
> Sent: Sunday, June 19, 2022 2:38 PM
> Cc: tutor at python.org
> Subject: Re: [Tutor] make a sqlite3 database from an ordinary text
> file
>
> Dear sir ,
>
> One more query ....
> How can i use column name in text file as colum names in sqlite3 database
.
> I tried using python but failed.
>
> Can you put some light over it ?
> Regards
> Manprit Singh
>
> On Sun, 19 Jun, 2022, 23:36 Alan Gauld via Tutor, <tutor at python.org>
> wrote:
>
> > On 19/06/2022 18:11, Manprit Singh wrote:
> >
> > > cur.execute("create table workdata(Time INT, Pieces INT)")
> >
> > This will fail the second time it runs. You should check if the
> > table already exists and either drop it or use the existing table.
> >
> > However, I rarely create tables from Python, it's generally easier
> > to create a sql file and run that directly using the sqlite interpreter.
> > You only need Python if the format of the table has to be deduced
> > from the data. and that then leads to all sorts of follow-on issues
> > over accessing names of columns etc.
> >
> > Once you have the table you can then use Python to load the data,
> > although in your case you could do that directly from SQL too since
> > it looks like a tab separated file.
> >
> > As always use the most appropriate tool. If you can do it directly
> > in SQL there is no point in wrapping it in Python. That just adds
> > complexity and slows things down.
> >
> > --
> > Alan G
> > Author of the Learn to Program web site http://www.alan-g.me.uk/
> > http://www.amazon.com/author/alan_gauld
> > Follow my photo-blog on Flickr at:
> > http://www.flickr.com/photos/alangauldphotos
> >
> >
> > _______________________________________________
> > Tutor maillist - Tutor at python.org
> > To unsubscribe or change subscription options:
> > https://mail.python.org/mailman/listinfo/tutor
> >
> _______________________________________________
> Tutor maillist - Tutor at python.org
> To unsubscribe or change subscription options:
> https://mail.python.org/mailman/listinfo/tutor
>
> _______________________________________________
> Tutor maillist - Tutor at python.org
> To unsubscribe or change subscription options:
> https://mail.python.org/mailman/listinfo/tutor
>
_______________________________________________
Tutor maillist - Tutor at python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
More information about the Tutor
mailing list