pixiv downloader 20121108

Change log:

  • Add a feature to download a member’s bookmarked images.
  • Add new filename format tokens for member’s bookmark mode.
    • %bookmark% ==> for bookmark mode, adds the ‘Bookmarks’ string.
    • %original_member_id%    ==> for bookmark mode, the original member ID.
    • %original_member_token% ==> for bookmark mode, the original member token.
    • %original_artist%       ==> for bookmark mode, the original artist name.
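
As a rough illustration of how the new tokens might expand into a filename, here is a minimal sketch. The helper `expand_filename_format` and every sample value are hypothetical and are not taken from the downloader's source; the real token handling may differ.

```python
def expand_filename_format(fmt, values):
    """Replace each %token% in fmt with its value from the dict.

    Hypothetical helper for illustration only.
    """
    result = fmt
    for token, value in values.items():
        result = result.replace("%" + token + "%", str(value))
    return result


# Sample values are made up for this example.
bookmark_values = {
    "bookmark": "Bookmarks",
    "original_member_id": 27517,
    "original_member_token": "sample_token",
    "original_artist": "Sample Artist",
}

name = expand_filename_format(
    "%bookmark% %original_artist% (%original_member_id%)", bookmark_values
)
print(name)
```

Tokens that are missing from the dict are left in place, which makes a typo in the format string easy to spot in the resulting filename.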

Download link for pixiv downloader 20121108; the source code is on GitHub.

On another note, this blog has reached 150k+ views 😀

28 thoughts on “pixiv downloader 20121108”

  1. When running the client with -n 1 or numberofpage = 1 it seems to check the first two pages instead of just the first page.

      1. PixivUtil2.exe -n 1
        Then from the menu I select 4 (download from list)

        I use list.txt to add artists that I follow and first download their gallery.
        Every week I run list download with -n 1 to get the recent updates.
        Checking the 2nd page slows down the update process since it isn’t needed.
        I’m guessing it’s as simple as having the loop that cycles through the pages stop one page sooner.
        (Like < instead of <=)
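
A guess at what such an off-by-one could look like, written as a self-contained sketch. The function names are hypothetical and this is not the actual PixivUtil2 loop; it only shows how `<=` versus `<` changes the number of pages visited.

```python
def pages_with_inclusive_bound(start_page, number_of_page):
    """Suspected behaviour: '<=' visits one page more than requested."""
    pages = []
    page = start_page
    while page <= start_page + number_of_page:
        pages.append(page)
        page += 1
    return pages


def pages_with_exclusive_bound(start_page, number_of_page):
    """Expected behaviour: '<' stops after exactly number_of_page pages."""
    pages = []
    page = start_page
    while page < start_page + number_of_page:
        pages.append(page)
        page += 1
    return pages


print(pages_with_inclusive_bound(1, 1))
print(pages_with_exclusive_bound(1, 1))
```

This sketch does not model the "0 means no limit" case.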

        Shouldn't affect it, but just in case here is my config file. I'm using a proxy and a custom naming format.

        [Settings]
        proxyaddress = 127.0.0.1:8123
        useproxy = True
        useragent = Mozilla/5.0 (X11; U; Unix i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1
        debughttp = False
        userobots = False
        filenameformat = %artist% (%member_id%)%urlFilename% – %title%
        filenamemangaformat = %artist% (%member_id%)%image_id% – %title%%urlFilename%
        timeout = 60
        uselist = True
        processfromdb = True
        overwrite = False
        tagsseparator = ,
        daylastupdated = 7
        rootdirectory = .
        retry = 3
        retrywait = 5
        createdownloadlists = False
        downloadlistdirectory = .
        irfanviewpath = C:\Program Files\IrfanView
        startirfanview = False
        startirfanslide = False
        alwayscheckfilesize = False
        checkupdatedlimit = 0
        downloadavatar = True
        createmangadir = False
        usetagsasdir = False
        useblacklisttags = False
        usesuppresstags = False
        tagslimit = -1
        writeimageinfo = False

        [Pixiv]
        numberofpage = 0

        [Authentication]
        username = xxxxx
        password = xxxxx
        cookie = xxxxx
        usessl = False
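
For anyone who wants to inspect a config like the one above from a script, here is a minimal sketch using Python 3's standard `configparser`. It is an assumption that this matches how the downloader itself reads the file. `load_settings` and the shortened sample text are hypothetical. Interpolation is switched off because values such as `filenameformat` contain `%` characters.

```python
import configparser

# Shortened copy of the settings above, for illustration only.
SAMPLE_CONFIG = """
[Settings]
proxyaddress = 127.0.0.1:8123
useproxy = True
filenameformat = %artist% (%member_id%)%urlFilename% – %title%
retry = 3

[Pixiv]
numberofpage = 0
"""


def load_settings(text):
    """Parse the INI text and return a few typed values."""
    # interpolation=None keeps '%artist%' and similar tokens literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {
        "useproxy": parser.getboolean("Settings", "useproxy"),
        "proxyaddress": parser.get("Settings", "proxyaddress"),
        "filenameformat": parser.get("Settings", "filenameformat"),
        "retry": parser.getint("Settings", "retry"),
        "numberofpage": parser.getint("Pixiv", "numberofpage"),
    }


settings = load_settings(SAMPLE_CONFIG)
print(settings["numberofpage"], settings["useproxy"])
```

Note that the file itself has `numberofpage = 0`, so in this report the limit of 1 comes only from the `-n 1` command-line option.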

      2. That does exactly what I need. I’ll just use that instead of the page option since it is a bit smarter. Thanks.

  2. I try to download image 22073711 and I get:
    Processing Image Id: 22073711
    Image ID (22073711): ‘An error occurred!’

      1. There is no “Error medium page for image ######.html” and no other message. The log looks like this:

        2012-11-20 04:21:04,108 – PixivUtil20121108 – INFO – Image id mode.
        2012-11-20 04:21:11,296 – PixivUtil20121108 – INFO – Image ID (22073711): ‘An error occurred!’

        what can I try?

  3. I’m getting an error message on Linux when using a proxy (not only for this image_id). I searched on Google but can’t figure out why… (link: http://bytes.com/topic/python/answers/31490-help-w-htmlparser-lib )

    And these are the versions of the software:
    python: Python 2.6.6
    mechanize: 1.64-1
    beautifulsoup: 3.1.0.1-2

    Error message:
    sh-4.1$ ./PixivUtil2.py -s 2 30000001
    PixivDownloader2 version 20121108
    https://nandaka.wordpress.com/tag/pixiv-downloader/
    Reading /…/PixivUtil2-master/config.ini …
    done.
    Using proxy: 127.0.0.1:4321
    Creating database… done.
    Only process member where day last updated >= 7
    Using Username: …
    logging in with saved cookie
    Trying to log with saved cookie
    done.
    Processing Image Id: 30000001
    Traceback (most recent call last):
      File "./PixivUtil2.py", line 545, in processImage
        parseMediumPage = BeautifulSoup(mediumPage.read())
      File "/usr/lib/pymodules/python2.6/BeautifulSoup.py", line 1499, in __init__
        BeautifulStoneSoup.__init__(self, *args, **kwargs)
      File "/usr/lib/pymodules/python2.6/BeautifulSoup.py", line 1230, in __init__
        self._feed(isHTML=isHTML)
      File "/usr/lib/pymodules/python2.6/BeautifulSoup.py", line 1263, in _feed
        self.builder.feed(markup)
      File "/usr/lib/python2.6/HTMLParser.py", line 108, in feed
        self.goahead(0)
      File "/usr/lib/python2.6/HTMLParser.py", line 150, in goahead
        k = self.parse_endtag(i)
      File "/usr/lib/python2.6/HTMLParser.py", line 317, in parse_endtag
        self.error("bad end tag: %r" % (rawdata[i:j],))
      File "/usr/lib/python2.6/HTMLParser.py", line 115, in error
        raise HTMLParseError(message, self.getpos())
    HTMLParseError: bad end tag: u"", at line 78, column 114
    Error at processImage(): (, HTMLParseError(), )
    Dumping html to: Error Medium Page for image 30000001.html
    Traceback (most recent call last):
      File "./PixivUtil2.py", line 1433, in main
        menuDownloadByImageId(mode, opisvalid, args)
      File "./PixivUtil2.py", line 1106, in menuDownloadByImageId
        processImage(mode, None, int(image_id))
      File "./PixivUtil2.py", line 545, in processImage
        parseMediumPage = BeautifulSoup(mediumPage.read())
      File "/usr/lib/pymodules/python2.6/BeautifulSoup.py", line 1499, in __init__
        BeautifulStoneSoup.__init__(self, *args, **kwargs)
      File "/usr/lib/pymodules/python2.6/BeautifulSoup.py", line 1230, in __init__
        self._feed(isHTML=isHTML)
      File "/usr/lib/pymodules/python2.6/BeautifulSoup.py", line 1263, in _feed
        self.builder.feed(markup)
      File "/usr/lib/python2.6/HTMLParser.py", line 108, in feed
        self.goahead(0)
      File "/usr/lib/python2.6/HTMLParser.py", line 150, in goahead
        k = self.parse_endtag(i)
      File "/usr/lib/python2.6/HTMLParser.py", line 317, in parse_endtag
        self.error("bad end tag: %r" % (rawdata[i:j],))
      File "/usr/lib/python2.6/HTMLParser.py", line 115, in error
        raise HTMLParseError(message, self.getpos())
    HTMLParseError: bad end tag: u"", at line 78, column 114

    The error html file generated seems fine when opened by my browser.
    Thanks :p

      1. It may be the proxy… but I also wonder what the error means…
        It said “HTMLParseError: bad end tag: u””, at line 78, column 114″, so I went to the dumped html file and found this string:
        window.jQuery || document.write(”);

        Why would this possibly cause trouble ? this ‘
        Thanks 🙂

        1. Can you upload the whole html to mediafire? I have checked the original html from pixiv, and it also has that string inside. I can parse it just fine.

          Can you try to run the application without using the proxy?

          EDIT: Just noticed that you are running from the script, not the compiled application. Can you update your mechanize version? See the readme for the recommended versions:
          – Running from source code:
          – Python 2.7.2++
          – mechanize 0.2.5
          – BeautifulSoup 3.2.0
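
A quick way to compare installed versions against the readme's recommendations could look like the sketch below. `parse_version` and `is_older` are hypothetical helpers, and the distro suffix handling (dropping everything after a `-`) is an assumption about how package versions such as `3.1.0.1-2` are written.

```python
# Recommended versions as listed in the readme excerpt above.
RECOMMENDED = {
    "python": (2, 7, 2),
    "mechanize": (0, 2, 5),
    "BeautifulSoup": (3, 2, 0),
}


def parse_version(text):
    """Turn '3.1.0.1-2' into (3, 1, 0, 1), ignoring a packaging suffix."""
    upstream = text.split("-")[0]
    return tuple(int(part) for part in upstream.split("."))


def is_older(installed, recommended):
    """True when the installed version sorts before the recommended one."""
    return parse_version(installed) < recommended


# Versions reported earlier in this thread.
print(is_older("2.6.6", RECOMMENDED["python"]))
print(is_older("3.1.0.1-2", RECOMMENDED["BeautifulSoup"]))
```

Both reported versions come out older than recommended, which is consistent with the traceback going through HTMLParser.py under BeautifulSoup 3.1.0.1.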

  4. Question! Can you help me out with an error I am getting with this release?

    I am trying to dl pictures using the first option (by pixiv user ID) and my output ends up like this:

    Input: 1
    Member id: 猫兎
    Start Page (default=1):
    End Page (default=0, 0 for no limit):
    Processing Member Id: 猫兎
    Reading C:\Python27\pixiv_utility\config.ini …
    done.
    Page 1
    http://www.pixiv.net/member_illust.php?id=猫兎&p=1
    ‘NoneType’ object has no attribute ‘ul’
    1 2 3 4
    http://www.pixiv.net/member_illust.php?id=猫兎&p=1
    ‘NoneType’ object has no attribute ‘ul’
    1 2 3 4
    http://www.pixiv.net/member_illust.php?id=猫兎&p=1
    ‘NoneType’ object has no attribute ‘ul’
    1 2 3 4
    http://www.pixiv.net/member_illust.php?id=猫兎&p=1
    ‘NoneType’ object has no attribute ‘ul’
    1

    No .html error page or similar was generated in the directory, and no picture was downloaded either. Is there anything else you need from me to help with this?

    1. Enter the ID, which is the numeric part from the url, not the artist name.
      For example: http://www.pixiv.net/member.php?id=27517 ==> 27517
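
If you would rather paste the profile URL than pick out the number by hand, the ID can be extracted with a few lines of Python. This is a sketch using Python 3's standard library; `member_id_from_url` is a hypothetical helper and is not part of the downloader.

```python
from urllib.parse import urlparse, parse_qs


def member_id_from_url(url):
    """Return the numeric 'id' query parameter of a pixiv member URL."""
    query = parse_qs(urlparse(url).query)
    values = query.get("id", [])
    if not values or not values[0].isdigit():
        raise ValueError("no numeric member id in: " + url)
    return int(values[0])


print(member_id_from_url("http://www.pixiv.net/member.php?id=27517"))
```

A URL built from an artist name instead of a number, as in the log above, raises `ValueError` rather than silently producing a bad request.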

    2. Gah! Nevermind, I googled a solution you gave someone else who had the same problem. I used the member name instead of the numbers in the url, so nevermind my query!

      Btw, now it works like a charm!

      1. Nope, you can check the source code on GitHub :D.

        As long as you download the application from this site or my GitHub, your password is safe with you.

  5. With both the previous version and this one, I’m getting this error when I launch the program:


    2012-11-08 14:19:15,046 – PixivUtil20121108 – INFO – Starting…
    2012-11-08 14:19:15,078 – PixivUtil20121108 – INFO – Only process member where day last updated >= 7
    2012-11-08 14:19:15,092 – PixivUtil20121108 – INFO – Using Username: cpgendo
    2012-11-08 14:19:15,108 – PixivUtil20121108 – INFO – Log in using secure form.
    2012-11-08 14:19:47,078 – PixivUtil20121108 – ERROR – Error at pixivLoginSSL(): (, <httperror_seek_wrapper (urllib2.HTTPError instance) at 0xe67ca8 whose wrapped object = <closeable_response at 0xe7b260 whose fp = <response_seek_wrapper at 0xe79ad0 whose wrapped object = <closeable_response at 0xe7b3c8 whose fp = >>>>, )
    Traceback (most recent call last):
      File "PixivUtil2.py", line 303, in pixivLoginSSL
      File "mechanize\_mechanize.pyc", line 203, in open
      File "mechanize\_mechanize.pyc", line 255, in _mech_open
    httperror_seek_wrapper: HTTP Error 504: Gateway Time-out
    2012-11-08 14:19:47,078 – PixivUtil20121108 – ERROR – Unknown Error: HTTP Error 504: Gateway Time-out
    Traceback (most recent call last):
      File "PixivUtil2.py", line 1413, in main
      File "PixivUtil2.py", line 303, in pixivLoginSSL
      File "mechanize\_mechanize.pyc", line 203, in open
      File "mechanize\_mechanize.pyc", line 255, in _mech_open
    httperror_seek_wrapper: HTTP Error 504: Gateway Time-out
    2012-11-08 14:19:52,062 – PixivUtil20121108 – INFO – EXIT

    Since all worked fine until a couple of hours ago, it seems like a problem that has just popped in. The only way to solve it, right now, is to set “usessl = False”.

    1. Looks like your ISP has a problem reaching the pixiv server. I’m currently running the app with useSSL = True.

      Try using a proxy to connect to pixiv, try a different ISP or PC, or check your date/time settings.

      Python’s SSL is quite sensitive to the PC’s date/time.

      1. The catch is, it happened on both my PCs, each of which has an Internet Key from a different ISP! But I will check whether it keeps happening (consider that some time ago, for a day, I was unable to use the downloader on the other PC for mysterious reasons – the day after, it was all back to normal).

        Is there a site where I can search for good proxies?
