From barry at python.org  Wed Dec  5 01:12:01 2012
From: barry at python.org (Barry Warsaw)
Date: Tue, 4 Dec 2012 19:12:01 -0500
Subject: [Python-porting] main() -> Py_SetProgramName()
Message-ID: <20121204191201.3589a972@limelight.wooz.org>

One gotcha with porting embedded Python 3 is the mismatch between main()'s
signature and Py_SetProgramName() and PySys_SetArgv().

In Python 2, everything was easy.  You got char*'s from main() and could pass
them directly to these two calls.  Not in Python 3, because they now take
wchar_t*'s instead.  I get why these signatures have changed, but that doesn't
make life very easy for porters.

Take a look at main() in Modules/python.c to see the headaches Python itself
goes through to do the conversions.  I think we're doing a disservice to
embedders not to provide convenience functions, alternative APIs, or at the
very least code examples for helping them do the argument conversions.  This
is not easy code, it's error prone, and folks shouldn't have to roll their own
every time they need to do this.

Using the algorithm in main() is probably not the best recommendation either,
because it uses non-public API methods such as _Py_char2wchar().  Perhaps
these should be promoted to a public method, or we should add a method to get
from main()'s char** to a wchar_t**.

For now, I've tried to use mbsrtowcs(), though I haven't done extensive
testing on the code.  I think Python ultimately uses mbstowcs() down deep in
its bowels.

There was some discussion of this back in 2009 IIRC, but nothing ever came of
it.  I think MvL at the time was against adding any convenience or alternative
API to Python.

Has anybody else encountered this while porting embedded Python applications
to Python 3?  How did you solve it?

I'm happy to bring this up on python-dev, but I also don't want to have to
wait until Python 3.4 to have a nice solution.

Cheers,
-Barry

From mal at egenix.com  Wed Dec  5 09:16:44 2012
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 05 Dec 2012 09:16:44 +0100
Subject: [Python-porting] main() -> Py_SetProgramName()
In-Reply-To: <20121204191201.3589a972@limelight.wooz.org>
References: <20121204191201.3589a972@limelight.wooz.org>
Message-ID: <50BF02EC.9000804@egenix.com>

On 05.12.2012 01:12, Barry Warsaw wrote:
> One gotcha with porting embedded Python 3 is the mismatch between main()'s
> signature and Py_SetProgramName() and PySys_SetArgv().
> 
> In Python 2, everything was easy.  You got char*'s from main() and could pass
> them directly to these two calls.  Not in Python 3, because they now take
> wchar_t*'s instead.  I get why these signatures have changed, but that doesn't
> make life very easy for porters.
>
> Take a look at main() in Modules/python.c to see the headaches Python itself
> goes through to do the conversions.  I think we're doing a disservice to
> embedders not to provide convenience functions, alternative APIs, or at the
> very least code examples for helping them do the argument conversions.  This
> is not easy code, it's error prone, and folks shouldn't have to roll their own
> every time they need to do this.
> 
> Using the algorithm in main() is probably not the best recommendation either,
> because it uses non-public API methods such as _Py_char2wchar().  Perhaps
> these should be promoted to a public method, or we should add a method to get
> from main()'s char** to a wchar_t**.
> 
> For now, I've tried to use mbsrtowcs(), though I haven't done extensive
> testing on the code.  I think Python ultimately uses mbstowcs() down deep in
> its bowels.

There's also another issue with the approach, since changing the
**argv from within Python is no longer possible on non-Windows
platforms.

This doesn't only affect embedded uses of Python, but all other
uses as well, e.g. it's no longer possible to change the ps output
under Unix for daemons and the like.
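
For context, here is a minimal sketch in C of the kind of in-place rewrite
this refers to. It assumes a Linux-like system where ps reports the bytes
stored in the original argv[0] area, and it only shows the overwrite itself.
The helper name overwrite_title() is hypothetical and is not part of any
Python API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of any Python API: overwrite the
   NUL-terminated string at `area` in place with `title`.  The write is
   truncated so it never grows past the original string, and the unused
   tail is zero-filled.  Returns the number of title bytes written. */
static size_t overwrite_title(char *area, const char *title)
{
    assert(area != NULL && title != NULL);

    size_t room = strlen(area);   /* bytes available before the NUL */
    size_t n = strlen(title);
    if (n > room)
        n = room;                 /* truncate rather than overflow */

    memcpy(area, title, n);
    memset(area + n, '\0', room - n + 1);
    return n;
}
```

An embedder would call something like overwrite_title(argv[0], "mydaemon")
on the char* handed to main(). This only works with the original char
**argv, which is why a wchar_t copy of the arguments is of no use here.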

I think that we should have APIs going from the original char **argv
to the Py_Main() wchar_t **argv one, as well as APIs that allow
changing or at least accessing the original char **argv from within
Python (on non-Windows platforms).

That said, I don't think this is going to happen in a patch level
release...

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Dec 05 2012)
>>> Python Projects, Consulting and Support ...   http://www.egenix.com/
>>> mxODBC.Zope/Plone.Database.Adapter ...       http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2012-11-28: Released eGenix mx Base 3.2.5 ...     http://egenix.com/go36
2013-01-22: Python Meeting Duesseldorf ...                 48 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From martin at v.loewis.de  Wed Dec  5 13:53:01 2012
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 05 Dec 2012 13:53:01 +0100
Subject: [Python-porting] main() -> Py_SetProgramName()
In-Reply-To: <20121204191201.3589a972@limelight.wooz.org>
References: <20121204191201.3589a972@limelight.wooz.org>
Message-ID: <50BF43AD.90401@v.loewis.de>

> Using the algorithm in main() is probably not the best recommendation either,
> because it uses non-public API methods such as _Py_char2wchar().  Perhaps
> these should be promoted to a public method, or we should add a method to get
> from main()'s char** to a wchar_t**.
>
> For now, I've tried to use mbsrtowcs(), though I haven't done extensive
> testing on the code.  I think Python ultimately uses mbstowcs() down deep in
> its bowels.
>
> There was some discussion of this back in 2009 IIRC, but nothing ever came of
> it.  I think MvL at the time was against adding any convenience or alternative
> API to Python.

If I said that, I may not have meant it this way. I may have been 
opposed to a convenience function that implicitly calls setlocale, which
in turn would be necessary before mbsrtowcs can do anything useful
(for non-ASCII characters).

> I'm happy to bring this up on python-dev, but I also don't want to have to
> wait until Python 3.4 to have a nice solution.

In which case a stand-alone convenience function could be provided, to 
be included in every project facing this issue.
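
A minimal sketch of what such a stand-alone function might look like, using
only standard C (mbsrtowcs) and no private Python API. The names
argv_to_wide() and argv_free_wide() are hypothetical. The sketch assumes the
caller has already decided whether to call setlocale(LC_ALL, ""); without
that call, only ASCII arguments can be relied on to convert.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: free an array produced by argv_to_wide(). */
static void argv_free_wide(int argc, wchar_t **wargv)
{
    if (wargv == NULL)
        return;
    for (int i = 0; i < argc; i++)
        free(wargv[i]);
    free(wargv);
}

/* Hypothetical helper: convert main()'s char** into a newly allocated,
   NULL-terminated wchar_t** using the current LC_CTYPE locale.
   Returns NULL on allocation failure or an invalid multibyte sequence. */
static wchar_t **argv_to_wide(int argc, char **argv)
{
    assert(argc >= 0 && argv != NULL);

    wchar_t **wargv = calloc((size_t)argc + 1, sizeof(*wargv));
    if (wargv == NULL)
        return NULL;

    for (int i = 0; i < argc; i++) {
        const char *src = argv[i];
        mbstate_t state;

        /* First pass with a NULL destination: count the wide chars. */
        memset(&state, 0, sizeof(state));
        size_t n = mbsrtowcs(NULL, &src, 0, &state);
        if (n == (size_t)-1) {
            argv_free_wide(i, wargv);
            return NULL;
        }

        wargv[i] = malloc((n + 1) * sizeof(wchar_t));
        if (wargv[i] == NULL) {
            argv_free_wide(i, wargv);
            return NULL;
        }

        /* Second pass: convert, including the terminating L'\0'. */
        src = argv[i];
        memset(&state, 0, sizeof(state));
        mbsrtowcs(wargv[i], &src, n + 1, &state);
    }
    return wargv;   /* wargv[argc] is NULL thanks to calloc() */
}
```

The result could then be handed to Py_SetProgramName(wargv[0]) and, after
Py_Initialize(), to PySys_SetArgv(argc, wargv). The array should stay
allocated while the interpreter is running, since Py_SetProgramName() keeps
the pointer rather than copying the string.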

Regards,
Martin



From solipsis at pitrou.net  Sat Dec  8 10:40:32 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 8 Dec 2012 09:40:32 +0000 (UTC)
Subject: [Python-porting] main() -> Py_SetProgramName()
References: <20121204191201.3589a972@limelight.wooz.org>
	<50BF02EC.9000804@egenix.com>
Message-ID: <loom.20121208T104000-608@post.gmane.org>

M.-A. Lemburg <mal at ...> writes:
> 
> There's also another issue with the approach, since changing the
> **argv from within Python is no longer possible on non-Windows
> platforms.
> 
> This doesn't only affect embedded uses of Python, but all other
> uses as well, e.g. it's no longer possible to change the ps output
> under Unix for daemons and the like.

setproctitle is your friend:
http://pypi.python.org/pypi/setproctitle

Regards

Antoine.



From mal at egenix.com  Sat Dec  8 12:57:53 2012
From: mal at egenix.com (M.-A. Lemburg)
Date: Sat, 08 Dec 2012 12:57:53 +0100
Subject: [Python-porting] main() -> Py_SetProgramName()
In-Reply-To: <loom.20121208T104000-608@post.gmane.org>
References: <20121204191201.3589a972@limelight.wooz.org>
	<50BF02EC.9000804@egenix.com>
	<loom.20121208T104000-608@post.gmane.org>
Message-ID: <50C32B41.8000300@egenix.com>

On 08.12.2012 10:40, Antoine Pitrou wrote:
> M.-A. Lemburg <mal at ...> writes:
>>
>> There's also another issue with the approach, since changing the
>> **argv from within Python is no longer possible on non-Windows
>> platforms.
>>
>> This doesn't only affect embedded uses of Python, but all other
>> uses as well, e.g. it's no longer possible to change the ps output
>> under Unix for daemons and the like.
> 
> setproctitle is your friend:
> http://pypi.python.org/pypi/setproctitle

Thanks for the pointer, but I think this is more than enough
proof that something should be done to make the situation in
Py3 easier for everyone.

Here's the hack he's using to find the original argv areas
by walking backwards from environ[0]...

https://github.com/dvarrazzo/py-setproctitle/blob/master/src/spt_setup.c#L139

-- 
Marc-Andre Lemburg
eGenix.com
