Encyclopedia:Database download
==Should I get multistream?==
{{tooltip|'''TL;DR'''|In summary}}: '''GET THE MULTISTREAM VERSION!''' (and the corresponding index file, ''pages-articles-multistream-index.txt.bz2'')

''pages-articles.xml.bz2'' and ''pages-articles-multistream.xml.bz2'' contain the same ''xml'' content, so unpacking either one yields the same data. With multistream, however, it is possible to extract an article from the archive without unpacking the whole thing. Your reader should handle this for you; if your reader doesn't support it, it will work anyway, since multistream and non-multistream contain the same ''xml''. The only downside to multistream is that it is marginally larger. You might be tempted to get the smaller non-multistream archive, but it is useless until you unpack it, and it unpacks to ~5–10 times its original size. Penny wise, pound foolish. Get multistream.

Note that the multistream dump file contains multiple bz2 'streams' (bz2 header, body, footer) concatenated into one file, in contrast to the vanilla file, which contains a single stream. Each separate 'stream' (or really, file) in the multistream dump contains 100 pages, except possibly the last one.

===How to use multistream?===
For multistream, you can get an index file, ''pages-articles-multistream-index.txt.bz2''. The first field of this index is the number of bytes to seek into the compressed archive ''pages-articles-multistream.xml.bz2'', the second is the article ID, and the third is the article title.

Cut a small part out of the archive with dd, using the byte offset found in the index. You can then either decompress it with bzip2 or use bzip2recover, and search the first file for the article ID.

See https://docs.python.org/3/library/bz2.html#bz2.BZ2Decompressor for info about such multistream files and about how to decompress them with python; see also https://gerrit.wikimedia.org/r/plugins/gitiles/operations/dumps/+/ariel/toys/bz2multistream/README.txt and related files for an old working toy.
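The seek-and-decompress procedure described above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation. It assumes the index fields are separated by colons (<code>offset:page_id:title</code>) and that index lines are sorted by offset; the helper names <code>parse_index_line</code>, <code>find_offsets</code> and <code>read_stream</code> are hypothetical and are not part of any dump tooling.

```python
import bz2


def parse_index_line(line):
    """Split one index line, assumed to look like 'offset:page_id:title'.

    Titles may themselves contain colons, so split at most twice.
    """
    offset, page_id, title = line.rstrip("\n").split(":", 2)
    return int(offset), int(page_id), title


def find_offsets(index_lines, title):
    """Return (offset, next_offset) for the stream holding `title`.

    `next_offset` is None when the title is in the last indexed stream.
    Returns None when the title is not in the index.
    Assumes index lines are sorted by ascending offset.
    """
    target = None
    for line in index_lines:
        offset, _page_id, page_title = parse_index_line(line)
        if target is not None and offset > target:
            return target, offset
        if page_title == title:
            target = offset
    return (target, None) if target is not None else None


def read_stream(archive, offset, next_offset=None):
    """Decompress the single bz2 stream that starts at byte `offset`.

    `archive` is a binary file object opened on the multistream dump.
    If `next_offset` is None, everything up to end of file is read;
    the decompressor stops at the end of the first stream regardless.
    """
    archive.seek(offset)
    size = -1 if next_offset is None else next_offset - offset
    data = archive.read(size)
    return bz2.BZ2Decompressor().decompress(data).decode("utf-8")
```

To try this against a real dump, the index lines could be obtained by iterating over <code>bz2.open(index_path, "rt", encoding="utf-8")</code> and the archive opened with <code>open(dump_path, "rb")</code>, where both paths are whatever you saved the files as. The returned text is an XML fragment of up to 100 <code>&lt;page&gt;</code> elements, which still has to be searched for the wanted title or article ID.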
===Other languages===
In the {{URL|//dumps.wikimedia.org/}} directory you will find the latest SQL and XML dumps for all the projects, not just English. The sub-directories are named for the [[List_of_ISO_639-1_codes|language code]] and the appropriate project. Some other directories (e.g. simple, nostalgia) exist, with the same structure. These dumps are also available from the [[iarchive:wikimediadownloads|Internet Archive]].