Public Wiki Server
Full Wikipedia mirror for restricted regions — wiki.tyfsadik.org
Overview
This project serves a complete, compressed Wikipedia mirror at wiki.tyfsadik.org using Kiwix. Kiwix serves content from ZIM files — a custom compressed archive format designed for offline encyclopedias. A single Wikipedia ZIM file contains the full text and images of the English Wikipedia (~90 GB) in a format that Kiwix can serve article-by-article with full-text search, without unpacking the archive.
The primary purpose is to provide Wikipedia access to users in countries where the site is
restricted or intermittently blocked. The Kiwix server requires no JavaScript from external
sources, no CDN dependencies, and makes no outbound network requests during operation —
every response is served entirely from the local ZIM file. The stack is intentionally minimal:
a single kiwix-serve process behind an Nginx reverse proxy with Let's Encrypt SSL.
No database, no dynamic backend, no user tracking.
Architecture
Tech Stack
- Kiwix / kiwix-serve — ZIM file HTTP server with full-text search
- ZIM format — compressed archive for offline Wikipedia content
- Nginx — reverse proxy, SSL termination, access logging
- Let's Encrypt / Certbot — TLS certificate for wiki.tyfsadik.org
- systemd — service unit for kiwix-serve auto-start and restart
- Linux (Debian) — host operating system
- aria2c — resumable multi-connection download for the 90 GB ZIM file
Build Process
Install kiwix-tools
The kiwix-tools package provides kiwix-serve (the HTTP server) and
kiwix-manage (library management). On Debian, the package is available
in the standard repository. Alternatively, a static binary is downloaded from the
Kiwix releases page for systems where the packaged version is outdated.
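apt install kiwix-tools -y
# Verify installation
kiwix-serve --version
# If package version is too old, use static binary:
wget https://download.kiwix.org/release/kiwix-tools/kiwix-tools_linux-x86_64.tar.gz
# The archive may unpack into a versioned top-level directory, so strip that
# directory and extract into a known location instead
mkdir -p kiwix-tools_linux-x86_64
tar xf kiwix-tools_linux-x86_64.tar.gz --strip-components=1 -C kiwix-tools_linux-x86_64
mv kiwix-tools_linux-x86_64/kiwix-serve /usr/local/bin/
chmod +x /usr/local/bin/kiwix-serve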
Download Wikipedia ZIM File
The Wikipedia ZIM file is downloaded from the Kiwix library. Due to its size (~90 GB),
aria2c is used for resumable multi-connection download. The
kiwix-manage tool creates a library XML file that kiwix-serve
reads to locate the ZIM.
apt install aria2 -y
mkdir -p /var/lib/kiwix
# Download with aria2c: 4 connections, resumable
aria2c -x 4 -c -d /var/lib/kiwix \
https://download.kiwix.org/zim/wikipedia/wikipedia_en_all_maxi_2024-12.zim
# Confirm the download completed and check its size (this does not verify integrity)
ls -lh /var/lib/kiwix/wikipedia_en_all_maxi_2024-12.zim
# Create Kiwix library
kiwix-manage /var/lib/kiwix/library.xml add \
/var/lib/kiwix/wikipedia_en_all_maxi_2024-12.zim
cat /var/lib/kiwix/library.xml
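Checking the file size only shows that the download finished; it does not detect corruption. A minimal sketch of a checksum comparison follows. The helper name verify_sha256 is hypothetical, and the usage comment assumes that the Kiwix download server publishes a ".sha256" file next to each ZIM, which should be confirmed before relying on it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compare a file's SHA-256 digest with an expected hex digest.
# Returns 0 on match, 1 on mismatch or if the file does not exist.
verify_sha256() {
    local file="$1" expected="$2" actual
    [ -f "$file" ] || return 1
    actual="$(sha256sum "$file" | awk '{print $1}')"
    [ "$actual" = "$expected" ]
}

# Example usage (assumption: "<zim-url>.sha256" exists and starts with the digest):
#   zim=/var/lib/kiwix/wikipedia_en_all_maxi_2024-12.zim
#   expected="$(curl -fsSL "https://download.kiwix.org/zim/wikipedia/$(basename "$zim").sha256" | awk '{print $1}')"
#   verify_sha256 "$zim" "$expected" && echo "checksum OK" || echo "checksum MISMATCH"
```

Hashing a file of this size reads all ~90 GB from disk, so the check can take several minutes.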
Create systemd Service for kiwix-serve
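A systemd service unit ensures kiwix-serve starts on boot, restarts automatically on failure, and runs as an unprivileged user. The service is configured to listen on port 8080 (localhost only) and serve from the library XML file. The ExecStart path below points at the static binary in /usr/local/bin; if kiwix-tools was installed from the Debian package instead, the binary is at /usr/bin/kiwix-serve.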
# /etc/systemd/system/kiwix-serve.service
[Unit]
Description=Kiwix ZIM HTTP Server
After=network.target
[Service]
Type=simple
User=www-data
Group=www-data
ExecStart=/usr/local/bin/kiwix-serve \
--library /var/lib/kiwix/library.xml \
--port 8080 \
--address 127.0.0.1 \
--threads 4
Restart=on-failure
RestartSec=5
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target
# Enable and start
systemctl daemon-reload
systemctl enable kiwix-serve
systemctl start kiwix-serve
systemctl status kiwix-serve
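"Enabled" (starts on boot) and "active" (running now) are separate states in systemd, and a unit started by hand can be active without being enabled. The sketch below checks both using systemctl is-enabled and systemctl is-active; the function name unit_ready is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether a unit is both enabled and currently running.
# Returns 0 only when both conditions hold.
unit_ready() {
    local unit="$1"
    if ! systemctl is-enabled --quiet "$unit"; then
        echo "$unit: not enabled (will not start on boot)"
        return 1
    fi
    if ! systemctl is-active --quiet "$unit"; then
        echo "$unit: enabled but not running"
        return 1
    fi
    echo "$unit: enabled and running"
}

# Example usage:
#   unit_ready kiwix-serve
```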
Nginx Reverse Proxy Configuration
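Nginx proxies requests from wiki.tyfsadik.org to kiwix-serve on
localhost port 8080. The proxy read timeout is set to 60 seconds to accommodate
full-text search queries, which can take several seconds on large ZIM files.
The ssl_certificate paths in the configuration exist only after Certbot has issued
the certificate; on a fresh host, nginx -t fails until then, so either issue the
certificate first or start with only the port-80 server block.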
# /etc/nginx/sites-available/wiki.tyfsadik.org
server {
listen 80;
server_name wiki.tyfsadik.org;
return 301 https://$host$request_uri;
}
server {
listen 443 ssl;
server_name wiki.tyfsadik.org;
ssl_certificate /etc/letsencrypt/live/wiki.tyfsadik.org/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/wiki.tyfsadik.org/privkey.pem;
location / {
proxy_pass http://127.0.0.1:8080;
proxy_http_version 1.1;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_read_timeout 60s;
}
}
ln -s /etc/nginx/sites-available/wiki.tyfsadik.org /etc/nginx/sites-enabled/
nginx -t && systemctl reload nginx
Provision SSL Certificate
Certbot provisions the Let's Encrypt certificate using the Nginx plugin, which also handles the HTTP-to-HTTPS redirect configuration automatically. The certificate auto-renews via a systemd timer.
apt install certbot python3-certbot-nginx -y
# The contact address is redacted in the source; set ADMIN_EMAIL before running
certbot --nginx -d wiki.tyfsadik.org \
--agree-tos --email "$ADMIN_EMAIL"
# Confirm HTTPS is serving correctly
curl -IL https://wiki.tyfsadik.org
# Expected: a 200 status (HTTP/1.1 200 OK with the config above; HTTP/2 requires enabling http2 in Nginx)
# Verify the cert details
certbot certificates
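As a complement to certbot certificates, the remaining validity of the certificate can be checked with openssl x509 -checkend, which exits non-zero if the certificate expires within the given number of seconds. This is a minimal sketch; the function name cert_valid_for_days and the 14-day threshold in the usage comment are illustrative assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the certificate in $1 stays valid for at least $2 more days.
cert_valid_for_days() {
    local cert="$1" days="$2"
    openssl x509 -in "$cert" -noout -checkend "$(( days * 86400 ))" >/dev/null
}

# Example usage (path taken from the Nginx configuration; threshold is an assumption):
#   cert_valid_for_days /etc/letsencrypt/live/wiki.tyfsadik.org/fullchain.pem 14 \
#       || echo "certificate expires in under 14 days; check the renewal timer"
```

On Debian, the renewal path itself can be exercised with certbot renew --dry-run.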
Test Article Access and Search
Article access and full-text search are tested directly against the kiwix-serve endpoint and through the Nginx proxy. The kiwix-serve /search endpoint returns an HTML results page by default (title suggestions from /suggest are returned as JSON), and both can be tested with curl.
# Test direct access to kiwix-serve
curl http://127.0.0.1:8080/
# Test article access (URL is ZIM-internal path; quoted so the shell does not interpret the parentheses)
curl -sL "http://127.0.0.1:8080/wikipedia_en_all_maxi_2024-12/A/Python_(programming_language)" \
| grep -i "<title>"
# Test full-text search API (books.name selects the ZIM; older kiwix-serve releases use content= instead)
curl -s "http://127.0.0.1:8080/search?books.name=wikipedia_en_all_maxi_2024-12&pattern=linux" \
| head -50
# Verify through Nginx proxy
curl https://wiki.tyfsadik.org/ | grep "Kiwix"
# Monitor kiwix-serve logs
journalctl -u kiwix-serve -f
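The manual checks above can be folded into a small repeatable smoke test. The sketch below fetches only the HTTP status code for each URL and treats 2xx and 3xx as success; the function names fetch_status and smoke_test are hypothetical, and the 10-second timeout is an arbitrary choice.

```shell
#!/usr/bin/env bash
# Fetch only the HTTP status code for a URL (curl prints 000 if the request fails).
fetch_status() {
    curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Check each URL given as an argument; print one line per URL and
# return non-zero if any URL did not answer with a 2xx or 3xx status.
smoke_test() {
    local url code failed=0
    for url in "$@"; do
        code="$(fetch_status "$url")"
        case "$code" in
            2??|3??) echo "OK $code $url" ;;
            *)       echo "FAIL $code $url"; failed=1 ;;
        esac
    done
    return "$failed"
}

# Example usage:
#   smoke_test "http://127.0.0.1:8080/" "https://wiki.tyfsadik.org/"
```

Checking the backend and the proxy in the same run makes it easier to tell whether a failure is in kiwix-serve or in Nginx.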
Data Flow
Challenges & Solutions
- ZIM file download size and reliability: The 90 GB Wikipedia ZIM file took over 12 hours to download on a typical VPS connection, and an interrupted download would require starting over. Solved by using aria2c with the -c flag (continue interrupted downloads) and -x 4 (4 parallel connections), which reduced total download time and allowed resuming after connection drops.
- Memory pressure from mmap on low-RAM server: kiwix-serve uses memory-mapped I/O to access the ZIM file, which can cause the kernel to use most available memory for the page cache. On a 4 GB RAM server this left little memory for Nginx and the OS. Tuned by reducing --threads from 8 to 4, which reduced the number of simultaneous mmap operations and kept memory usage within acceptable bounds.
- kiwix-serve not starting after system reboot: The systemd unit was not enabled, so it ran when started manually but did not start on boot. Resolved by running systemctl enable kiwix-serve, which creates the symlink in the appropriate wants directory (multi-user.target.wants).
- Full-text search returning no results for some queries: Kiwix full-text search is case-sensitive by default for exact article title matching. Users expecting Google-style fuzzy search were confused. Added a note to the landing page explaining the search behavior and linking to the ZIM file's built-in search page.
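For the memory-pressure item above, it helps to separate page cache from memory that is actually unavailable: cached file pages are reclaimable, and MemAvailable in /proc/meminfo is the kernel's estimate of what can still be allocated without swapping. A minimal sketch for reading those fields follows; the function name meminfo_mib is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one /proc/meminfo field in MiB.
# Values in the file are reported in kB; an alternate file can be passed for testing.
meminfo_mib() {
    local field="$1" file="${2:-/proc/meminfo}"
    awk -v key="$field:" '$1 == key { printf "%d\n", $2 / 1024 }' "$file"
}

# Example usage:
#   echo "available: $(meminfo_mib MemAvailable) MiB, page cache: $(meminfo_mib Cached) MiB"
#   ps -o rss= -C kiwix-serve   # resident memory of kiwix-serve itself, in kB
```

A large Cached value alongside a healthy MemAvailable value suggests the ZIM file is simply being cached rather than the server running out of memory.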
What I Learned
- ZIM file format: structure, article offsets, embedded full-text index, and compression
- Memory-mapped file I/O and its interaction with the Linux kernel page cache
- systemd service unit design for long-running single-binary services
- aria2c multi-connection resumable downloading for large files
- Serving static content without any database or dynamic backend at scale
- Access considerations for users in internet-restricted regions