Michael Foord wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div class="Ih2E3d">Facundo Batista wrote:<br>
> 2008/2/11, Jeff Younker <<a href="mailto:jeff@drinktomi.com">jeff@drinktomi.com</a>>:<br>><br>><br>>> enough. People don't read the documentation in enough detail. If<br>>> the library is leading to a problem, then the default for the library<br>
>> needs to be changed.<br>>><br>><br><br></div>My understanding is that it does have a default user-agent (that looks<br>very similar to the one you suggest).<br><br>The only thing that would work (as far as I can tell) is to force the<br>
programmer to set an explicit user-agent rather than sending a default<br>one. This would obviously break a lot of code and there is no guarantee<br>that programmers would set it to anything useful...</blockquote><div><br>
I am firmly against removing the default User-Agent string in urllib.<br><br>I agree that forcing users to explicitly set a User-Agent string is the only way to get everyone using urllib to set their own. But many applications would still end up sharing the same User-Agent string (because underlying frameworks will set a default of their own), and many users would set User-Agent strings that don't help anyone contact the application's maintainer.<br>
<br>The beauty and usefulness of urllib rely on the fact that it Just Works. If I just want to retrieve a web page with a simple HTTP GET request, it should be _simple_. If urllib forces developers to set all sorts of things they don't know or care about, like User-Agent strings, they won't use it. Some of them will even leave Python for a language that has a friendlier library for this purpose.<br>
<br><br>But I'm not just going to be negative...<br><br>I propose that we update the docs (!!). Yes, I did read the post about why documentation doesn't help, but I think context matters here, as well as time spans.<br>
<br>Anyone doing more than just a few quick retrievals of web pages with urllib is bound to consult the docs sometime. If urllib's docs have a prominent note regarding setting the User-Agent, many of these people will notice it. I'm not talking about seasoned pros in the field who know the tool and have no reason to consult the docs; I'm talking about new users learning as their requirements expand, who will definitely consult the docs at some point.<br>
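<br>For illustration, such a docs note could include something along these lines. This is only a minimal sketch, assuming Python 3's urllib.request module; the application name, contact URL and the build_request helper are hypothetical and not part of urllib.<br>

```python
import urllib.request

# Hypothetical application name and contact URL; replace with your own.
USER_AGENT = "ExampleApp/1.0 (+https://example.com/contact)"


def build_request(url, user_agent=USER_AGENT):
    """Return a Request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


req = build_request("https://example.com/")

# Request normalizes header names with str.capitalize(), so the header
# is looked up under "User-agent" rather than "User-Agent".
print(req.get_header("User-agent"))
```

Passing req to urllib.request.urlopen() would then send the request with this header instead of the library default; the sketch stops short of that so it runs without network access.<br>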
<br>Beyond that, basic users who only ever use urllib's most basic features (probably by copy-pasting code from some tutorial/blog/forum) will never set a meaningful User-Agent anyway, and trying to force them to would be counter-productive on our part.<br>
<br>As for existing applications/frameworks which don't set User-Agent strings, in the long run they will eventually notice a prominent note in the docs. To help things along in the short term, I think it would be better to reach them by conventional means than by breaking their code or otherwise forcing them to make such a change. ("conventional" meaning posts like these to mailing lists/forums, and perhaps a few on comp.lang.python and some blogs.)<br>
<br><br>- Tal<br></div></div>