Generation

tibiawikisql.generation

Functions related to generating a database dump from TibiaWiki.

Classes:

Name Description
Category

Defines the article groups to be fetched.

Functions:

Name Description
img_label

Get the label to show in progress bars when iterating images.

article_label

Get the label to show in progress bars when iterating articles.

constraint

Limit a string to a certain length if exceeded.

progress_bar

Get a progress bar iterator.

get_cache_info

Get a mapping of the last edit times of images stored in the cache.

save_cache_info

Store the last edit times of images stored in the cache.

fetch_image

Fetch an image from TibiaWiki and save it to disk.

save_images

Fetch and save the images of articles of a certain category.

save_maps

Save the map files from the TibiaMaps GitHub repository.

generate_outfit_image_names

Generate the list of image names to extract for outfits, as well as their parametrized information.

save_outfit_images

Save outfit images into the database.

fetch_category_entries

Fetch a list of wiki entries in a certain category.

fetch_images

Fetch all images for fetched articles.

generate_spell_offers

Fetch and save the spell offers from the spell data module.

Attributes:

Name Type Description
ConnCursor

Type alias for either sqlite3 Cursor or Connection objects.

OUTFIT_NAME_TEMPLATES

The templates for image filenames for outfits.

OUTFIT_ADDON_SEQUENCE

The sequence of addon types to iterate.

OUTFIT_SEX_SEQUENCE

The sequence of outfit sexes to iterate.

CATEGORIES

The categories to fetch and generate objects for.

ConnCursor module-attribute

ConnCursor = Cursor | Connection

Type alias for either sqlite3 Cursor or Connection objects.

OUTFIT_NAME_TEMPLATES module-attribute

OUTFIT_NAME_TEMPLATES = [
    "Outfit %s Male.gif",
    "Outfit %s Male Addon 1.gif",
    "Outfit %s Male Addon 2.gif",
    "Outfit %s Male Addon 3.gif",
    "Outfit %s Female.gif",
    "Outfit %s Female Addon 1.gif",
    "Outfit %s Female Addon 2.gif",
    "Outfit %s Female Addon 3.gif",
]

The templates for image filenames for outfits.

OUTFIT_ADDON_SEQUENCE module-attribute

OUTFIT_ADDON_SEQUENCE = (0, 1, 2, 3) * 2

The sequence of addon types to iterate.

OUTFIT_SEX_SEQUENCE module-attribute

OUTFIT_SEX_SEQUENCE = ['Male'] * 4 + ['Female'] * 4

The sequence of outfit sexes to iterate.

CATEGORIES module-attribute

CATEGORIES = {
    "achievements": Category(
        "Achievements", AchievementParser, no_images=True
    ),
    "spells": Category(
        "Spells", SpellParser, generate_map=True
    ),
    "items": Category(
        "Objects", ItemParser, generate_map=True
    ),
    "creatures": Category(
        "Creatures", CreatureParser, generate_map=True
    ),
    "books": Category(
        "Book Texts", BookParser, no_images=True
    ),
    "keys": Category("Keys", KeyParser, no_images=True),
    "npcs": Category("NPCs", NpcParser, generate_map=True),
    "imbuements": Category(
        "Imbuements", ImbuementParser, extension=".png"
    ),
    "quests": Category(
        "Quest Overview Pages", QuestParser, no_images=True
    ),
    "house": Category(
        "Player-Ownable Buildings",
        HouseParser,
        no_images=True,
    ),
    "charm": Category(
        "Charms", CharmParser, extension=".png"
    ),
    "outfits": Category(
        "Outfits", OutfitParser, no_images=True
    ),
    "worlds": Category(
        "Game Worlds",
        WorldParser,
        no_images=True,
        include_deprecated=True,
    ),
    "mounts": Category("Mounts", MountParser),
    "updates": Category(
        "Updates", UpdateParser, no_images=True
    ),
}

The categories to fetch and generate objects for.

Category

Category(
    name: str | None,
    parser: type[BaseParser],
    *,
    no_images: bool = False,
    extension: str = ".gif",
    include_deprecated: bool = False,
    generate_map: bool = False,
)

Defines the article groups to be fetched.

Class for internal use only, for easier autocompletion and maintenance.

Parameters:

Name Type Description Default
name str | None

The name of the TibiaWiki category containing the articles. Doesn't need the Category: prefix.

required
parser type[BaseParser]

The parser class to use.

required
no_images bool

Indicate that there is no image extraction from this category's items.

False
extension str

The filename extension for images.

'.gif'
include_deprecated bool

Whether to always include deprecated articles from this category.

False
generate_map bool

Whether to generate a mapping of article names to their article instance for later processing.

False
Source code in tibiawikisql/generation.py
def __init__(
        self,
        name: str | None,
        parser: type[BaseParser],
        *,
        no_images: bool = False,
        extension: str = ".gif",
        include_deprecated: bool = False,
        generate_map: bool = False,
) -> None:
    """Create a new instance of the class.

    Args:
        name: The name of the TibiaWiki category containing the articles. Doesn't need the `Category:` prefix.
        parser: The parser class to use.
        no_images: Indicate that there is no image extraction from this category's items.
        extension: The filename extension for images.
        include_deprecated: Whether to always include deprecated articles from this category.
        generate_map: Whether to generate a mapping of article names to their article instance for later processing.

    """
    self.name = name
    self.parser = parser
    self.no_images = no_images
    self.extension = extension
    self.include_deprecated = include_deprecated
    self.generate_map = generate_map

img_label

img_label(item: Image | None) -> str

Get the label to show in progress bars when iterating images.

Parameters:

Name Type Description Default
item Image | None

The image being iterated.

required

Returns:

Type Description
str

The name of the image's file or an empty string.

Source code in tibiawikisql/generation.py
def img_label(item: Image | None) -> str:
    """Get the label to show in progress bars when iterating images.

    Args:
        item: The image being iterated.

    Returns:
        The name of the image's file or an empty string.

    """
    if item is None:
        return ""
    return item.clean_name

article_label

article_label(item: Article | None) -> str

Get the label to show in progress bars when iterating articles.

Parameters:

Name Type Description Default
item Article | None

The article being iterated.

required

Returns:

Type Description
str

The article's title, cropped to 25 characters, or an empty string.

Source code in tibiawikisql/generation.py
def article_label(item: Article | None) -> str:
    """Get the label to show in progress bars when iterating articles.

    Args:
        item: The article being iterated.

    Returns:
        The article's title, cropped to 25 characters, or an empty string.

    """
    if item is None:
        return ""
    return constraint(item.title, 25)

constraint

constraint(value: str, limit: int) -> str

Limit a string to a certain length if exceeded.

Parameters:

Name Type Description Default
value str

The string whose length is to be constrained.

required
limit int

The length limit.

required

Returns:

Type Description
str

If the string exceeds the limit, it is cropped to the limit and ends with an ellipsis; otherwise the same string is returned.

Source code in tibiawikisql/generation.py
def constraint(value: str, limit: int) -> str:
    """Limit a string to a certain length if exceeded.

    Args:
        value: The string whose length is to be constrained.
        limit: The length limit.

    Returns:
        If the string exceeds the limit, it is cropped to the limit and ends with an ellipsis; otherwise the same string is returned.
    """
    if len(value) <= limit:
        return value
    return value[:limit - 1] + "…"
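
As a quick illustration of the cropping behaviour, the sketch below repeats the function body shown above and applies it to two strings. The sample titles are made up for the example; the limit of 25 matches the value used by article_label.

```python
def constraint(value: str, limit: int) -> str:
    """Crop value to at most `limit` characters, ending in an ellipsis if cropped."""
    if len(value) <= limit:
        return value
    # Keep limit - 1 characters so the ellipsis fits within the limit.
    return value[:limit - 1] + "…"


short_title = constraint("Dragon", 25)  # within the limit, returned unchanged
long_title = constraint("The Secret Library Quest/Spoiler", 25)  # cropped
print(short_title)
print(long_title)
```

Note that the ellipsis counts toward the limit, so a cropped string is exactly `limit` characters long.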

progress_bar

progress_bar(
    iterable: Iterable[V] | None = None,
    length: int | None = None,
    label: str | None = None,
    item_show_func: Callable[[V | None], str | None]
    | None = None,
    info_sep: str = "  ",
    width: int = 36,
) -> ProgressBar[V]

Get a progress bar iterator.

Source code in tibiawikisql/generation.py
def progress_bar(
        iterable: Iterable[V] | None = None,
        length: int | None = None,
        label: str | None = None,
        item_show_func: Callable[[V | None], str | None] | None = None,
        info_sep: str = "  ",
        width: int = 36,
) -> ProgressBar[V]:
    """Get a progress bar iterator."""
    return click.progressbar(
        iterable,
        length,
        label,
        True,
        True,
        True,
        item_show_func,
        "█",
        "░",
        f"%(label)s {Fore.YELLOW}%(bar)s{Style.RESET_ALL} %(info)s",
        info_sep,
        width,
    )

get_cache_info

get_cache_info(folder_name: str) -> dict[str, datetime]

Get a mapping of the last edit times of images stored in the cache.

Parameters:

Name Type Description Default
folder_name str

The name of the folder containing the stored images.

required

Returns:

Type Description
dict[str, datetime]

A dictionary, where each key is an image filename and its value is its upload date to the wiki.

Source code in tibiawikisql/generation.py
def get_cache_info(folder_name: str) -> dict[str, datetime.datetime]:
    """Get a mapping of the last edit times of images stored in the cache.

    Args:
        folder_name: The name of the folder containing the stored images.

    Returns:
        A dictionary, where each key is an image filename and its value is its upload date to the wiki.

    """
    try:
        with open(f"images/{folder_name}/cache_info.json") as f:
            data = json.load(f)
            return {k: datetime.datetime.fromisoformat(v) for k, v in data.items()}
    except (FileNotFoundError, json.JSONDecodeError):
        return {}

save_cache_info

save_cache_info(
    folder_name: str, cache_info: dict[str, datetime]
) -> None

Store the last edit times of images stored in the cache.

Parameters:

Name Type Description Default
folder_name str

The name of the folder containing the stored images.

required
cache_info dict[str, datetime]

A mapping of file names to their last upload date.

required
Source code in tibiawikisql/generation.py
def save_cache_info(folder_name: str, cache_info: dict[str, datetime.datetime]) -> None:
    """Store the last edit times of images stored in the cache.

    Args:
        folder_name: The name of the folder containing the stored images.
        cache_info: A mapping of file names to their last upload date.

    """
    with open(f"images/{folder_name}/cache_info.json", "w") as f:
        json.dump({k: v.isoformat() for k, v in cache_info.items()}, f)
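
The two cache helpers above rely on datetimes surviving a trip through JSON as ISO 8601 strings. The sketch below shows that round trip in isolation. It is only an illustration: it writes to a temporary directory instead of the images/ folder, and the file name and timestamp are invented.

```python
import datetime
import json
import tempfile
from pathlib import Path

# Hypothetical cache content: image file name -> upload time on the wiki.
cache_info = {
    "Dragon.gif": datetime.datetime(2024, 1, 15, 12, 30, tzinfo=datetime.timezone.utc),
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "cache_info.json"
    # Same serialization as save_cache_info: datetimes become ISO 8601 strings.
    path.write_text(json.dumps({k: v.isoformat() for k, v in cache_info.items()}))
    # Same deserialization as get_cache_info: strings are parsed back into datetimes.
    raw = json.loads(path.read_text())
    restored = {k: datetime.datetime.fromisoformat(v) for k, v in raw.items()}

print(raw)
print(restored == cache_info)
```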

fetch_image

fetch_image(
    session: Session, folder: str, image: Image
) -> bytes

Fetch an image from TibiaWiki and save it to disk.

Parameters:

Name Type Description Default
session Session

The request session to use to fetch the image.

required
folder str

The folder where the images will be stored locally.

required
image Image

The image data.

required

Returns:

Type Description
bytes

The bytes of the image.

Source code in tibiawikisql/generation.py
def fetch_image(session: requests.Session, folder: str, image: Image) -> bytes:
    """Fetch an image from TibiaWiki and save it to disk.

    Args:
        session:
            The request session to use to fetch the image.
        folder:
            The folder where the images will be stored locally.
        image:
            The image data.

    Returns:
        The bytes of the image.

    """
    r = session.get(image.file_url)
    r.raise_for_status()
    image_bytes = r.content
    with open(f"images/{folder}/{image.file_name}", "wb") as f:
        f.write(image_bytes)
    return image_bytes

save_images

save_images(
    conn: Connection, key: str, value: Category
) -> None

Fetch and save the images of articles of a certain category.

Parameters:

Name Type Description Default
conn Connection

Connection to the database.

required
key str

The name of the data store key to use.

required
value Category

The category of the images.

required
Source code in tibiawikisql/generation.py
def save_images(conn: sqlite3.Connection, key: str, value: Category) -> None:
    """Fetch and save the images of articles of a certain category.

    Args:
        conn: Connection to the database.
        key: The name of the data store key to use.
        value: The category of the images.

    """
    extension = value.extension
    table = value.parser.table.__tablename__
    column = "title"
    results = conn.execute(f"SELECT {column} FROM {table}")
    titles = [f"{r[0]}{extension}" for r in results]
    os.makedirs(f"images/{table}", exist_ok=True)
    cache_info = get_cache_info(table)
    cache_count = 0
    fetch_count = 0
    failed = []
    generator = wiki_client.get_images_info(titles)
    session = requests.Session()
    with (
        timed() as t,
        progress_bar(generator, len(titles), f"Fetching {key} images", item_show_func=img_label) as bar,
    ):
        for image in bar:  # type: Image
            if image is None:
                continue
            try:
                last_update = cache_info.get(image.file_name)
                if last_update is None or image.timestamp > last_update:
                    image_bytes = fetch_image(session, table, image)
                    fetch_count += 1
                    cache_info[image.file_name] = image.timestamp
                else:
                    with open(f"images/{table}/{image.file_name}", "rb") as f:
                        image_bytes = f.read()
                    cache_count += 1
            except FileNotFoundError:
                image_bytes = fetch_image(session, table, image)
                fetch_count += 1
                cache_info[image.file_name] = image.timestamp
            except requests.HTTPError:
                failed.append(image.file_name)
                continue
            conn.execute(f"UPDATE {table} SET image = ? WHERE {column} = ?", (image_bytes, image.clean_name))
        save_cache_info(table, cache_info)
    if failed:
        click.echo(f"{Style.RESET_ALL}\tCould not fetch {len(failed):,} images.{Style.RESET_ALL}")
        click.echo(f"\t-> {Style.RESET_ALL}{f'{Style.RESET_ALL},{Style.RESET_ALL}'.join(failed)}{Style.RESET_ALL}")
    click.echo(f"{Fore.GREEN}\tSaved {key} images in {t.elapsed:.2f} seconds."
               f"\n\t{fetch_count:,} fetched, {cache_count:,} from cache.{Style.RESET_ALL}")
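
The caching rule in the loop above can be read as a single predicate: an image is downloaded when the cache has no record of it, or when the wiki reports a newer upload than the cached one. The helper below is a hypothetical extraction of that condition for illustration; `needs_fetch` is not a function of the module, and the file names and dates are invented.

```python
import datetime


def needs_fetch(
    cache_info: dict[str, datetime.datetime],
    file_name: str,
    wiki_timestamp: datetime.datetime,
) -> bool:
    """Return True if the image should be downloaded instead of read from disk."""
    last_update = cache_info.get(file_name)
    return last_update is None or wiki_timestamp > last_update


cached = {"Dragon.gif": datetime.datetime(2024, 1, 1)}
unchanged = needs_fetch(cached, "Dragon.gif", datetime.datetime(2024, 1, 1))
updated = needs_fetch(cached, "Dragon.gif", datetime.datetime(2024, 6, 1))
unknown = needs_fetch(cached, "Demon.gif", datetime.datetime(2023, 1, 1))
print(unchanged, updated, unknown)
```

In the listing above, a cached record whose file is missing from disk still ends in a download, through the FileNotFoundError branch.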

save_maps

save_maps(con: ConnCursor) -> None

Save the map files from the TibiaMaps GitHub repository.

Parameters:

Name Type Description Default
con ConnCursor

A connection or cursor to the database.

required
Source code in tibiawikisql/generation.py
def save_maps(con: ConnCursor) -> None:
    """Save the map files from the TibiaMaps GitHub repository.

    Args:
        con: A connection or cursor to the database.

    """
    url = "https://tibiamaps.github.io/tibia-map-data/floor-{0:02d}-map.png"
    os.makedirs("images/map", exist_ok=True)
    for z in range(16):
        try:
            with open(f"images/map/{z}.png", "rb") as f:
                image = f.read()
        except FileNotFoundError:
            r = requests.get(url.format(z))
            r.raise_for_status()
            image = r.content
            with open(f"images/map/{z}.png", "wb") as f:
                f.write(image)
        except requests.HTTPError:
            continue
        con.execute("INSERT INTO map(z, image) VALUES(?,?)", (z, image))
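
To make the URL template concrete, the short sketch below expands it for the sixteen floors iterated in the listing above (z from 0 to 15). It only builds the strings and performs no network requests.

```python
url = "https://tibiamaps.github.io/tibia-map-data/floor-{0:02d}-map.png"

# The {0:02d} field zero-pads the floor number to two digits.
floor_urls = [url.format(z) for z in range(16)]

print(floor_urls[0])
print(floor_urls[7])
print(floor_urls[15])
```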

generate_outfit_image_names

generate_outfit_image_names(
    rows: list[tuple[int, str]],
) -> tuple[list[str], dict[str, tuple[int, int, str]]]

Generate the list of image names to extract for outfits, as well as their parametrized information.

Parameters:

Name Type Description Default
rows list[tuple[int, str]]

A list of article ID and title pairs.

required

Returns:

Name Type Description
titles list[str]

A list of filenames to download.

image_info dict[str, tuple[int, int, str]]

A mapping of file names to their article ID, addon type and outfit sex.

Source code in tibiawikisql/generation.py
def generate_outfit_image_names(rows: list[tuple[int, str]]) -> tuple[list[str], dict[str, tuple[int, int, str]]]:
    """Generate the list of image names to extract for outfits, as well as their parametrized information.

    Args:
        rows: A list of article ID and title pairs.

    Returns:
        titles: A list of filenames to download.
        image_info: A mapping of file names to their article ID, addon type and outfit sex.

    """
    titles = []
    image_info = {}
    for article_id, name in rows:
        for i, image_name in enumerate(OUTFIT_NAME_TEMPLATES):
            file_name = image_name % name
            image_info[file_name] = (article_id, OUTFIT_ADDON_SEQUENCE[i], OUTFIT_SEX_SEQUENCE[i])
            titles.append(file_name)
    return titles, image_info
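
The following sketch shows how the three module-level sequences line up by index. It copies the constants and the function body from this page so that it runs on its own; the article ID 1 and the outfit name "Citizen" are sample inputs chosen for the example.

```python
OUTFIT_NAME_TEMPLATES = [
    "Outfit %s Male.gif",
    "Outfit %s Male Addon 1.gif",
    "Outfit %s Male Addon 2.gif",
    "Outfit %s Male Addon 3.gif",
    "Outfit %s Female.gif",
    "Outfit %s Female Addon 1.gif",
    "Outfit %s Female Addon 2.gif",
    "Outfit %s Female Addon 3.gif",
]
OUTFIT_ADDON_SEQUENCE = (0, 1, 2, 3) * 2
OUTFIT_SEX_SEQUENCE = ["Male"] * 4 + ["Female"] * 4


def generate_outfit_image_names(rows):
    """Build the file names for every outfit, plus a lookup of their parameters."""
    titles = []
    image_info = {}
    for article_id, name in rows:
        for i, image_name in enumerate(OUTFIT_NAME_TEMPLATES):
            file_name = image_name % name
            # Index i selects the matching addon type and sex for this template.
            image_info[file_name] = (article_id, OUTFIT_ADDON_SEQUENCE[i], OUTFIT_SEX_SEQUENCE[i])
            titles.append(file_name)
    return titles, image_info


titles, image_info = generate_outfit_image_names([(1, "Citizen")])
print(titles[0])
print(image_info["Outfit Citizen Female Addon 2.gif"])
```

Each outfit row therefore yields eight file names, one per template.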

save_outfit_images

save_outfit_images(conn: ConnCursor) -> None

Save outfit images into the database.

Parameters:

Name Type Description Default
conn ConnCursor

A connection or cursor to the database.

required
Source code in tibiawikisql/generation.py
def save_outfit_images(conn: ConnCursor) -> None:
    """Save outfit images into the database.

    Args:
        conn: A connection or cursor to the database.

    """
    parser = OutfitParser
    table = parser.table.__tablename__
    os.makedirs(f"images/{table}", exist_ok=True)
    try:
        results = conn.execute(f"SELECT article_id, name FROM {table}")
    except sqlite3.Error:
        results = []
    if not results:
        return

    cache_info = get_cache_info(table)
    titles, image_info = generate_outfit_image_names(results)

    session = requests.Session()
    generator = wiki_client.get_images_info(titles)
    cache_count = 0
    fetch_count = 0
    failed = []
    with (
        timed() as t,
        progress_bar(generator, len(titles), "Fetching outfit images", item_show_func=img_label) as bar,
    ):
        for image in bar:
            if image is None:
                continue
            try:
                last_update = cache_info.get(image.file_name)
                if last_update is None or image.timestamp > last_update:
                    image_bytes = fetch_image(session, table, image)
                    fetch_count += 1
                    cache_info[image.file_name] = image.timestamp
                else:
                    with open(f"images/{table}/{image.file_name}", "rb") as f:
                        image_bytes = f.read()
                    cache_count += 1
            except FileNotFoundError:
                image_bytes = fetch_image(session, table, image)
                fetch_count += 1
                cache_info[image.file_name] = image.timestamp
            except requests.HTTPError:
                failed.append(image.file_name)
                continue
            article_id, addons, sex = image_info[image.file_name]
            conn.execute("INSERT INTO outfit_image(outfit_id, addon, sex, image) VALUES(?, ?, ?, ?)",
                         (article_id, addons, sex, image_bytes))
        save_cache_info(table, cache_info)
    if failed:
        click.echo(f"{Style.RESET_ALL}\tCould not fetch {len(failed):,} images.{Style.RESET_ALL}")
        click.echo(f"\t-> {Style.RESET_ALL}{f'{Style.RESET_ALL},{Style.RESET_ALL}'.join(failed)}{Style.RESET_ALL}")
    click.echo(f"{Fore.GREEN}\tSaved outfit images in {t.elapsed:.2f} seconds."
               f"\n\t{fetch_count:,} fetched, {cache_count:,} from cache.{Style.RESET_ALL}")

fetch_category_entries

fetch_category_entries(
    category: str, exclude_titles: set[str] | None = None
) -> list[WikiEntry]

Fetch a list of wiki entries in a certain category.

Parameters:

Name Type Description Default
category str

The name of the TibiaWiki category.

required
exclude_titles set[str] | None

Exclude articles matching these titles.

None

Returns:

Type Description
list[WikiEntry]

A list of entries contained in the category.

Source code in tibiawikisql/generation.py
def fetch_category_entries(category: str, exclude_titles: set[str] | None = None) -> list[WikiEntry]:
    """Fetch a list of wiki entries in a certain category.

    Args:
        category: The name of the TibiaWiki category.
        exclude_titles: Exclude articles matching these titles.

    Returns:
        A list of entries contained in the category.

    """
    click.echo(f"Fetching articles in {Fore.BLUE}Category:{category}{Style.RESET_ALL}...")
    entries = []
    with timed() as t:
        for entry in wiki_client.get_category_members(category):
            if exclude_titles and entry.title in exclude_titles:
                continue
            if entry.title.startswith("User:") or entry.title.startswith("TibiaWiki:"):
                continue
            entries.append(entry)
    click.echo(f"\t{Fore.GREEN}Found {len(entries):,} articles in {t.elapsed:.2f} seconds.{Style.RESET_ALL}")
    return entries
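
The filtering rules in the loop above can be tried without contacting the wiki. The sketch below applies the same two checks to a hand-written list; the `Entry` class and `filter_entries` function are hypothetical stand-ins for WikiEntry and the loop body, and the titles are invented.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical stand-in for WikiEntry; only the title is needed here."""
    title: str


def filter_entries(entries, exclude_titles=None):
    """Drop explicitly excluded titles and pages in the User: or TibiaWiki: namespaces."""
    kept = []
    for entry in entries:
        if exclude_titles and entry.title in exclude_titles:
            continue
        if entry.title.startswith("User:") or entry.title.startswith("TibiaWiki:"):
            continue
        kept.append(entry)
    return kept


sample = [Entry("Dragon"), Entry("User:Someone/Sandbox"), Entry("TibiaWiki:Policy"), Entry("Demon")]
kept_titles = [e.title for e in filter_entries(sample, exclude_titles={"Demon"})]
print(kept_titles)
```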

fetch_images

fetch_images(conn: Connection) -> None

Fetch all images for fetched articles.

Parameters:

Name Type Description Default
conn Connection

A connection to the database.

required
Source code in tibiawikisql/generation.py
def fetch_images(conn: sqlite3.Connection) -> None:
    """Fetch all images for fetched articles.

    Args:
        conn: A connection to the database.

    """
    with conn:
        for key, value in CATEGORIES.items():
            if value.no_images:
                continue
            save_images(conn, key, value)
        save_outfit_images(conn)
        save_maps(conn)
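
To see which categories the loop above actually passes to save_images, the sketch below mirrors the no_images and extension values from the CATEGORIES listing on this page. `CategoryFlags` is a hypothetical, reduced stand-in for Category that omits the parser and the other options.

```python
from dataclasses import dataclass


@dataclass
class CategoryFlags:
    """Hypothetical stand-in holding only the fields relevant to image fetching."""
    no_images: bool = False
    extension: str = ".gif"


# Values copied from the CATEGORIES listing; parsers and other flags are left out.
categories = {
    "achievements": CategoryFlags(no_images=True),
    "spells": CategoryFlags(),
    "items": CategoryFlags(),
    "creatures": CategoryFlags(),
    "books": CategoryFlags(no_images=True),
    "keys": CategoryFlags(no_images=True),
    "npcs": CategoryFlags(),
    "imbuements": CategoryFlags(extension=".png"),
    "quests": CategoryFlags(no_images=True),
    "house": CategoryFlags(no_images=True),
    "charm": CategoryFlags(extension=".png"),
    "outfits": CategoryFlags(no_images=True),
    "worlds": CategoryFlags(no_images=True),
    "mounts": CategoryFlags(),
    "updates": CategoryFlags(no_images=True),
}

# Same filter as in fetch_images: skip every category flagged with no_images.
with_images = [key for key, value in categories.items() if not value.no_images]
print(with_images)
```

Outfits are flagged with no_images because their images are handled separately by save_outfit_images.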

generate_spell_offers

generate_spell_offers(
    conn: Connection, data_store: dict[str, Any]
) -> None

Fetch and save the spell offers from the spell data module.

Parameters:

Name Type Description Default
conn Connection

A connection to the database.

required
data_store dict[str, Any]

The data store containing information about generated articles.

required
Source code in tibiawikisql/generation.py
def generate_spell_offers(conn: sqlite3.Connection, data_store: dict[str, Any]) -> None:
    """Fetch and save the spell offers from the spell data module.

    Args:
        conn: A connection to the database.
        data_store: The data store containing information about generated articles.

    """
    if "npcs_map" not in data_store or "spells_map" not in data_store:
        return
    article = wiki_client.get_article("Module:ItemPrices/spelldata")
    spell_offers = parse_spell_data(article.content)
    rows = []
    not_found_store = defaultdict(set)
    with (
        timed() as t,
        progress_bar(spell_offers, len(spell_offers), "Processing spell offers") as bar,
    ):
        for npc, spell, knight, paladin, druid, sorcerer, monk in bar:
            spell_id = data_store["spells_map"].get(spell.lower())
            if spell_id is None:
                not_found_store["spell"].add(spell)
                continue
            npc_id = data_store["npcs_map"].get(npc.lower())
            if npc_id is None:
                not_found_store["npc"].add(npc)
                continue
            rows.append((
                npc_id,
                spell_id,
                knight,
                sorcerer,
                paladin,
                druid,
                monk,
            ))
        with conn:
            conn.execute("DELETE FROM npc_spell")
            conn.executemany(
                "INSERT INTO npc_spell(npc_id, spell_id, knight, sorcerer, paladin, druid, monk) VALUES(?, ?, ?, ?, ?, ?, ?)",
                rows)
        if not_found_store["spell"]:
            unknown_spells = not_found_store["spell"]
            click.echo(f"{Fore.RED}Could not parse offers for {len(unknown_spells):,} spells.{Style.RESET_ALL}")
            click.echo(f"\t-> {Fore.RED}{f'{Style.RESET_ALL},{Fore.RED}'.join(unknown_spells)}{Style.RESET_ALL}")
        if not_found_store["npc"]:
            unknown_npcs = not_found_store["npc"]
            click.echo(f"{Fore.RED}Could not parse offers of {len(unknown_npcs):,} npcs.{Style.RESET_ALL}")
            click.echo(f"\t-> {Fore.RED}{f'{Style.RESET_ALL},{Fore.RED}'.join(unknown_npcs)}{Style.RESET_ALL}")
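
The row-building step above does two things that are easy to miss: names are looked up in lower case, and the vocation flags are reordered from the parsed order (knight, paladin, druid, sorcerer, monk) to the column order of npc_spell (knight, sorcerer, paladin, druid, monk). The sketch below reproduces just that step with invented data; the NPC and spell names, the IDs and the flags are placeholders, and no database or wiki access is involved.

```python
from collections import defaultdict

# Hypothetical lookup maps, keyed by lower-cased names as in the listing above.
data_store = {
    "spells_map": {"spell x": 10},
    "npcs_map": {"npc a": 20},
}

# Tuples in parsed order: npc, spell, knight, paladin, druid, sorcerer, monk.
spell_offers = [
    ("Npc A", "Spell X", False, True, True, False, False),
    ("Npc A", "Spell Y", True, False, False, False, False),  # unknown spell
    ("Npc B", "Spell X", True, False, False, False, False),  # unknown NPC
]

rows = []
not_found = defaultdict(set)
for npc, spell, knight, paladin, druid, sorcerer, monk in spell_offers:
    spell_id = data_store["spells_map"].get(spell.lower())
    if spell_id is None:
        not_found["spell"].add(spell)
        continue
    npc_id = data_store["npcs_map"].get(npc.lower())
    if npc_id is None:
        not_found["npc"].add(npc)
        continue
    # Reordered to match the npc_spell columns.
    rows.append((npc_id, spell_id, knight, sorcerer, paladin, druid, monk))

print(rows)
print(dict(not_found))
```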