pixiv downloader 20121108

Change log:

Added a feature to download a member’s bookmarked images.
Added new filename format tokens for member’s bookmark mode:
- %bookmark% ==> in bookmark mode, adds the ‘Bookmarks’ string.
- %original_member_id% ==> in bookmark mode, the original member id.
- %original_member_token% ==> in bookmark mode, the original member token.
- %original_artist% ==> in bookmark mode, the original artist name.
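
As a rough illustration of how these tokens could expand into a filename, here is a minimal Python sketch. It is not the downloader's actual code: the function name `expand_filename_format` and all sample values are assumptions made up for this example, and it only shows plain `%token%` substitution.

```python
def expand_filename_format(fmt, values):
    """Replace each %token% in fmt with its value.

    Tokens that have no entry in `values` are left untouched.
    """
    result = fmt
    for token, value in values.items():
        result = result.replace("%" + token + "%", str(value))
    return result


# Hypothetical values for an image reached through a member's bookmarks.
bookmark_values = {
    "bookmark": "Bookmarks",
    "original_member_id": 27517,
    "original_member_token": "sample_token",
    "original_artist": "Sample Artist",
    "image_id": 22073711,
}

fmt = "%bookmark%/%original_artist% (%original_member_id%)/%image_id%"
print(expand_filename_format(fmt, bookmark_values))
# prints: Bookmarks/Sample Artist (27517)/22073711
```

Outside bookmark mode, a real implementation would probably expand `%bookmark%` to an empty string; that behavior is not shown here.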

Download link for pixiv downloader 20121108; the source code is on GitHub.

On another note, this blog has reached 150k+ views 😀

28 thoughts on “pixiv downloader 20121108”

Mizu says:

May 25, 2013 at 12:31

Thanks for the great work >__< I really appreciate it ^^
Mylek says:

November 21, 2012 at 14:16

When running the client with -n 1 or numberofpage = 1 it seems to check the first two pages instead of just the first page.
1. nandaka says:
  
  November 21, 2012 at 15:35
  
  Can you give me the full command you are using?
  1. Mylek says:
    
    November 24, 2012 at 13:41
    
    PixivUtil2.exe -n 1
    Then from the menu I select 4 (download from list)
    
    I use list.txt to add artists that I follow and first download their gallery.
    Every week I run list download with -n 1 to get the recent updates.
    Checking the 2nd page slows down the update process since it isn’t needed.
    I’m guessing it’s as simple as having the loop that cycles through the pages stop one page sooner.
    (Like < instead of <=)
    
    Shouldn't affect it, but just in case here is my config file. I'm using a proxy and a custom naming format.
    
    [Settings]
    proxyaddress = 127.0.0.1:8123
    useproxy = True
    useragent = Mozilla/5.0 (X11; U; Unix i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1
    debughttp = False
    userobots = False
    filenameformat = %artist% (%member_id%)%urlFilename% – %title%
    filenamemangaformat = %artist% (%member_id%)%image_id% – %title%%urlFilename%
    timeout = 60
    uselist = True
    processfromdb = True
    overwrite = False
    tagsseparator = ,
    daylastupdated = 7
    rootdirectory = .
    retry = 3
    retrywait = 5
    createdownloadlists = False
    downloadlistdirectory = .
    irfanviewpath = C:\Program Files\IrfanView
    startirfanview = False
    startirfanslide = False
    alwayscheckfilesize = False
    checkupdatedlimit = 0
    downloadavatar = True
    createmangadir = False
    usetagsasdir = False
    useblacklisttags = False
    usesuppresstags = False
    tagslimit = -1
    writeimageinfo = False
    
    [Pixiv]
    numberofpage = 0
    
    [Authentication]
    username = xxxxx
    password = xxxxx
    cookie = xxxxx
    usessl = False
    1. nandaka says:
      
      November 24, 2012 at 15:00
      
      You can use checkUpdatedLimit to stop checking after n images that are already downloaded. Try setting it to 19 (1 page).
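
The idea behind this setting can be sketched in Python as below. This is only an illustration, not the code in PixivUtil2: the function name `images_to_download`, the newest-first ordering, the "0 disables the check" convention, and the exact stopping condition (stop once the count reaches the limit) are all assumptions.

```python
def images_to_download(image_ids, already_downloaded, check_updated_limit):
    """Return the image IDs that still need fetching.

    image_ids is assumed to be ordered newest first. Scanning stops once
    `check_updated_limit` already-downloaded images have been seen.
    A limit of 0 is assumed to disable the early stop.
    """
    to_fetch = []
    seen = 0
    for image_id in image_ids:
        if image_id in already_downloaded:
            seen += 1
            if check_updated_limit > 0 and seen >= check_updated_limit:
                break  # everything older is assumed to be downloaded too
            continue
        to_fetch.append(image_id)
    return to_fetch


# Hypothetical gallery: two new images on top of three old ones.
gallery = [105, 104, 103, 102, 101]
print(images_to_download(gallery, {103, 102, 101}, 2))
# prints: [105, 104]
```

Compared with a fixed page count, a counter like this stops as soon as the scan reaches known images, however many pages of new work an artist has posted.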
  2. Mylek says:
    
    November 25, 2012 at 17:52
    
    That does exactly what I need. I’ll just use that instead of the page option since it is a bit smarter. Thanks.
Anonymous says:

November 20, 2012 at 10:09

I try to download image 22073711 and I get:
Processing Image Id: 22073711
Image ID (22073711): ‘An error occurred!’
1. nandaka says:
  
  November 20, 2012 at 14:32
  
  Was any error HTML file generated? Can you give more details, and can you try again?
  1. Anonymous says:
    
    November 20, 2012 at 17:25
    
    There is no “Error medium page for image ######.html” and no other message. The log looks like this:
    
    2012-11-20 04:21:04,108 - PixivUtil20121108 - INFO - Image id mode.
    2012-11-20 04:21:11,296 - PixivUtil20121108 - INFO - Image ID (22073711): 'An error occurred!'
    
    what can I try?
2. nandaka says:
  
  November 21, 2012 at 06:52
  
  I found the cause; I’ll fix it over the weekend 😀
  1. Anonymous says:
    
    November 21, 2012 at 07:25
    
    cool, thanks!
Alexander says:

November 20, 2012 at 05:05

I’m getting an error message on Linux when using a proxy (not only for this image_id). I searched on Google but can’t figure out why… (link: http://bytes.com/topic/python/answers/31490-help-w-htmlparser-lib )

And these are the versions of the software:
python: Python 2.6.6
mechanize: 1.64-1
beautifulsoup: 3.1.0.1-2

Error message:
sh-4.1$ ./PixivUtil2.py -s 2 30000001
PixivDownloader2 version 20121108
https://nandaka.wordpress.com/tag/pixiv-downloader/
Reading /…/PixivUtil2-master/config.ini …
done.
Using proxy: 127.0.0.1:4321
Creating database… done.
Only process member where day last updated >= 7
Using Username: …
logging in with saved cookie
Trying to log with saved cookie
done.
Processing Image Id: 30000001
Traceback (most recent call last):
  File "./PixivUtil2.py", line 545, in processImage
    parseMediumPage = BeautifulSoup(mediumPage.read())
  File "/usr/lib/pymodules/python2.6/BeautifulSoup.py", line 1499, in __init__
    BeautifulStoneSoup.__init__(self, *args, **kwargs)
  File "/usr/lib/pymodules/python2.6/BeautifulSoup.py", line 1230, in __init__
    self._feed(isHTML=isHTML)
  File "/usr/lib/pymodules/python2.6/BeautifulSoup.py", line 1263, in _feed
    self.builder.feed(markup)
  File "/usr/lib/python2.6/HTMLParser.py", line 108, in feed
    self.goahead(0)
  File "/usr/lib/python2.6/HTMLParser.py", line 150, in goahead
    k = self.parse_endtag(i)
  File "/usr/lib/python2.6/HTMLParser.py", line 317, in parse_endtag
    self.error("bad end tag: %r" % (rawdata[i:j],))
  File "/usr/lib/python2.6/HTMLParser.py", line 115, in error
    raise HTMLParseError(message, self.getpos())
HTMLParseError: bad end tag: u"", at line 78, column 114
Error at processImage(): (, HTMLParseError(), )
Dumping html to: Error Medium Page for image 30000001.html
Traceback (most recent call last):
  File "./PixivUtil2.py", line 1433, in main
    menuDownloadByImageId(mode, opisvalid, args)
  File "./PixivUtil2.py", line 1106, in menuDownloadByImageId
    processImage(mode, None, int(image_id))
  File "./PixivUtil2.py", line 545, in processImage
    parseMediumPage = BeautifulSoup(mediumPage.read())
  File "/usr/lib/pymodules/python2.6/BeautifulSoup.py", line 1499, in __init__
    BeautifulStoneSoup.__init__(self, *args, **kwargs)
  File "/usr/lib/pymodules/python2.6/BeautifulSoup.py", line 1230, in __init__
    self._feed(isHTML=isHTML)
  File "/usr/lib/pymodules/python2.6/BeautifulSoup.py", line 1263, in _feed
    self.builder.feed(markup)
  File "/usr/lib/python2.6/HTMLParser.py", line 108, in feed
    self.goahead(0)
  File "/usr/lib/python2.6/HTMLParser.py", line 150, in goahead
    k = self.parse_endtag(i)
  File "/usr/lib/python2.6/HTMLParser.py", line 317, in parse_endtag
    self.error("bad end tag: %r" % (rawdata[i:j],))
  File "/usr/lib/python2.6/HTMLParser.py", line 115, in error
    raise HTMLParseError(message, self.getpos())
HTMLParseError: bad end tag: u"", at line 78, column 114

The error html file generated seems fine when opened by my browser.
Thanks :p
1. nandaka says:
  
  November 20, 2012 at 06:50
  
  Weird, I ran the same command and it completed successfully.
  
  Can you try again? Most likely it is caused by the proxy.
  1. Alexander says:
    
    November 20, 2012 at 13:33
    
    It may be the proxy… but I also wonder what the error means…
    It said "HTMLParseError: bad end tag: u"", at line 78, column 114", so I went to the dumped html file and found this string:
    window.jQuery || document.write(”);
    
    Why would this possibly cause trouble?
    Thanks 🙂
  2. Alexander says:
    
    November 20, 2012 at 13:37
    
    I don’t know why part of the string won’t show up in WordPress; here’s the missing string on pastebin:
    http://pastebin.com/raw.php?i=yzaAz7BX
    1. nandaka says:
      
      November 20, 2012 at 14:18
      
      Can you upload the whole html to mediafire? I have checked the original html from pixiv, and it also has that string inside. I can parse it just fine.
      
      Can you try to run the application without using proxy?
      
      EDIT: I just noticed that you are running from the script, not the compiled application. Can you update your mechanize version? See the readme for the recommended versions:
      - Running from source code:
        - Python 2.7.2++
        - mechanize 0.2.5
        - BeautifulSoup 3.2.0
  3. Alexander says:
    
    November 20, 2012 at 18:40
    
    The html files:
    http://www.mediafire.com/?y1tf6674uoh7x9d
    
    And sadly I can’t connect to pixiv directly (they probably banned my IP address).
    Yeah… I’ll try updating my software as well. XD
Taro Kuroyoko says:

November 16, 2012 at 21:28

Question! Can you help me out with an error I am getting with this release?

I am trying to dl pictures using the first option (by pixiv user ID) and my output ends up like this:

—

Input: 1
Member id: 猫兎
Start Page (default=1):
End Page (default=0, 0 for no limit):
Processing Member Id: 猫兎
Reading C:\Python27\pixiv_utility\config.ini …
done.
Page 1
http://www.pixiv.net/member_illust.php?id=猫兎&p=1
‘NoneType’ object has no attribute ‘ul’
1 2 3 4
http://www.pixiv.net/member_illust.php?id=猫兎&p=1
‘NoneType’ object has no attribute ‘ul’
1 2 3 4
http://www.pixiv.net/member_illust.php?id=猫兎&p=1
‘NoneType’ object has no attribute ‘ul’
1 2 3 4
http://www.pixiv.net/member_illust.php?id=猫兎&p=1
‘NoneType’ object has no attribute ‘ul’
1

—

No .html error page or similar was generated in the directory, and no picture was downloaded either. Is there anything else you need from me to help with this?
1. nandaka says:
  
  November 16, 2012 at 21:31
  
  Enter the ID, which is the numeric part from the url, not the artist name.
  For example: http://www.pixiv.net/member.php?id=27517 ==> 27517
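
For anyone who would rather paste the profile URL than pick the number out by hand, the extraction can be sketched as follows. This is a hypothetical helper written for Python 3, not part of the downloader; the name `member_id_from_url` is made up, and it assumes the id always sits in the `id` query parameter as in the example URL.

```python
from urllib.parse import parse_qs, urlparse


def member_id_from_url(url):
    """Return the numeric member id from a pixiv member URL.

    Raises ValueError when the `id` parameter is missing or not numeric,
    for example when an artist name was used instead of the number.
    """
    query = parse_qs(urlparse(url).query)
    values = query.get("id", [])
    if not values or not (values[0].isascii() and values[0].isdigit()):
        raise ValueError("no numeric member id in URL: %r" % url)
    return int(values[0])


print(member_id_from_url("http://www.pixiv.net/member.php?id=27517"))
# prints: 27517
```

Rejecting non-numeric input up front gives a clear error message instead of a failure later while parsing the page.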
2. Taro Kuroyoko says:
  
  November 16, 2012 at 22:08
  
  Gah! Never mind, I googled a solution you gave someone else who had the same problem. I used the member name instead of the number in the url, so never mind my query!
  
  Btw, now it works like a charm!
  1. Taro Kuroyoko says:
    
    November 16, 2012 at 22:08
    
    But thanks regardless! 😀
chaosscizzors says:

November 13, 2012 at 05:43

awesome app you have here. you have given me an excuse to start collecting artwork again. <3
1. Saya66 says:
  
  November 14, 2012 at 15:23
  
  Does this program report my password to you?
  1. nandaka says:
    
    November 14, 2012 at 15:47
    
    Nope, you can check the source code on GitHub :D.
    
    As long as you download the application from this site or my GitHub, your password stays safe with you.
Zoram says:

November 8, 2012 at 21:21

With both the previous version and this one, I’m getting this error when I launch the program:

—
2012-11-08 14:19:15,046 - PixivUtil20121108 - INFO - Starting…
2012-11-08 14:19:15,078 - PixivUtil20121108 - INFO - Only process member where day last updated >= 7
2012-11-08 14:19:15,092 - PixivUtil20121108 - INFO - Using Username: cpgendo
2012-11-08 14:19:15,108 - PixivUtil20121108 - INFO - Log in using secure form.
2012-11-08 14:19:47,078 - PixivUtil20121108 - ERROR - Error at pixivLoginSSL(): (, <httperror_seek_wrapper (urllib2.HTTPError instance) at 0xe67ca8 whose wrapped object = <closeable_response at 0xe7b260 whose fp = <response_seek_wrapper at 0xe79ad0 whose wrapped object = <closeable_response at 0xe7b3c8 whose fp = >>>>, )
Traceback (most recent call last):
  File "PixivUtil2.py", line 303, in pixivLoginSSL
  File "mechanize\_mechanize.pyc", line 203, in open
  File "mechanize\_mechanize.pyc", line 255, in _mech_open
httperror_seek_wrapper: HTTP Error 504: Gateway Time-out
2012-11-08 14:19:47,078 - PixivUtil20121108 - ERROR - Unknown Error: HTTP Error 504: Gateway Time-out
Traceback (most recent call last):
  File "PixivUtil2.py", line 1413, in main
  File "PixivUtil2.py", line 303, in pixivLoginSSL
  File "mechanize\_mechanize.pyc", line 203, in open
  File "mechanize\_mechanize.pyc", line 255, in _mech_open
httperror_seek_wrapper: HTTP Error 504: Gateway Time-out
2012-11-08 14:19:52,062 - PixivUtil20121108 - INFO - EXIT
—

Since everything worked fine until a couple of hours ago, this seems to be a problem that has just popped up. The only way to work around it right now is to set “usessl = False”.
1. nandaka says:
  
  November 8, 2012 at 21:24
  
  Looks like your ISP has a problem reaching the pixiv server. I’m currently running the app with useSSL = True.
  
  Try using a proxy to connect to pixiv, use a different ISP/PC, or check your date/time settings.
  
  Python’s SSL is quite sensitive to the PC’s date/time.
  1. Zoram says:
    
    November 8, 2012 at 21:35
    
    The catch is, it happened on both my PCs, each of which has an Internet Key from a different ISP! But I will check whether it keeps happening (consider that some time ago, for a day, I was unable to use the downloader on the other PC for mysterious reasons; the day after, it was all back to normal).
    
    Is there a site where I can search for good proxies?
    1. nandaka says:
      
      November 8, 2012 at 21:55
      
      Google 😀 or try the Tor-Vidalia bundle.
