May 23

Don’t be NiceGUI

I have a Python CLI app that I enhanced to add a web UI, using NiceGUI. It worked very well, so I decided to do it again for another app that I’m working on right now. Of course, months after doing the first app, I can’t remember the steps, so I figured I’d document it this time. There will likely be some backtracking and rewriting as I figure this out, so this won’t be a cookbook on how to do it, but by reading through the process, you should understand what to do.

 

About The App Being Modified

For shooting competitions, there is a web site, Practiscore, where people can find events, and then register for them.  Once approved, the participants can join one of the “squads”, which are limited in size. The site handles the whole registration process, and squadding process. Event coordinators can see the overall registration list, and the names of the people in each squad.

However, it is useful for an event coordinator to see things like how many volunteer staff are in each squad, whether participants are members of the hosting club, who has paid, who has not selected a squad yet, etc. The web site allows the coordinators to download a CSV file with all the registration information for each participant. There’s no API or other way to interact with the web site.

I created the “Squatter” app to read the CSV file, and allow the user to generate a variety of reports, grouped by squad, and displaying specific fields. The app allows the creation of “computed fields”, which will take a set of fields and apply some operation to them and produce a new field. As a simple example, a computed field “Full Name” can join the “First Name” and “Last Name” fields. Other computations can be “any”, “all”, and “has”.
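To make the computed-field idea concrete, here is a minimal sketch. Only the “join” behaviour is spelled out in this post; the meanings given to “any”, “all”, and “has” below, along with the function name and the sample column names, are assumptions for illustration and not the app’s actual code.

```python
def compute_field(row: dict[str, str], kind: str, names: str,
                  separator: str = " ", pattern: str = "") -> str:
    """Evaluate one computed field against a single CSV row.

    `names` holds source field names separated by "|",
    for example "First Name|Last Name".
    """
    values = [row.get(name, "").strip() for name in names.split("|")]
    if kind == "join":
        # Concatenate the non-empty source values with the separator.
        return separator.join(v for v in values if v)
    if kind == "any":
        # Assumed meaning: "Yes" if at least one source field is non-empty.
        return "Yes" if any(values) else "No"
    if kind == "all":
        # Assumed meaning: "Yes" only if every source field is non-empty.
        return "Yes" if values and all(values) else "No"
    if kind == "has":
        # Assumed meaning: "Yes" if any source field contains the pattern.
        return "Yes" if any(pattern in v for v in values) else "No"
    raise ValueError(f"Unknown computed field type: {kind}")


row = {"First Name": "Pat", "Last Name": "Smith", "Staff": "", "Notes": "RO certified"}
print(compute_field(row, "join", "First Name|Last Name"))  # Pat Smith
print(compute_field(row, "any", "Staff|Notes"))            # Yes
print(compute_field(row, "has", "Notes", pattern="RO"))    # Yes
```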

For reporting, the app allows the user to “filter” the output based on a field (computed or plain), and a condition (equal, not-equal, starts-with).
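A filter of this kind can be sketched as a lookup from condition name to a small predicate. The function name, the sample rows, and the choice of case-sensitive comparison are assumptions made for the example, not taken from the app.

```python
from typing import Callable

# The three conditions named above, mapped to simple string predicates.
CONDITIONS: dict[str, Callable[[str, str], bool]] = {
    "equal": lambda actual, expected: actual == expected,
    "not-equal": lambda actual, expected: actual != expected,
    "starts-with": lambda actual, expected: actual.startswith(expected),
}


def apply_filter(rows: list[dict[str, str]], field: str,
                 condition: str, value: str) -> list[dict[str, str]]:
    """Keep only the rows whose field satisfies the condition."""
    check = CONDITIONS[condition]
    return [row for row in rows if check(row.get(field, ""), value)]


rows = [
    {"Full Name": "Pat Smith", "Squad": "Squad 1", "Paid": "Yes"},
    {"Full Name": "Lee Jones", "Squad": "Squad 2", "Paid": "No"},
    {"Full Name": "Sam Brown", "Squad": "", "Paid": "Yes"},
]
unpaid = apply_filter(rows, "Paid", "not-equal", "Yes")
print([r["Full Name"] for r in unpaid])  # ['Lee Jones']
```

Because the field is looked up by name in the row, the same function works for a plain field or for a computed field that has already been added to the row.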

A config file is used to pre-define some filters and computed field names, per event type.

Initially, the user will create an “event type”, by reading a CSV for a specific event. They would then create filters and computed fields as needed. Finally, they would download a current CSV for an event of that type and generate the desired reports.  Additional downloads may occur, as it nears the event date, to generate a report with the latest data.

Preliminaries

I’m using Python 3.13.1, git for version control, and uv for package management. I was using poetry, but found uv to be much faster. With uv, I have the pyproject.toml file configured to run my CLI app via an alias:

[project.scripts]
squatter = "squatter.cli.main:entry_point"

The layout of the project is:

squatter
├── __init__.py
├── cli
│   ├── __init__.py
│   ├── computed_field.py
│   ├── event.py
│   ├── filter.py
│   ├── generate.py
│   ├── main.py
│   └── template.py
├── config.py
├── csv_render.py
├── database.py
├── domain
│   ├── exceptions.py
│   ├── filter.py
│   └── types.py
├── edit_field.py
├── edit_template.py
├── excel_render.py
├── loader.py
├── models.py
├── report.py
├── repos
│   ├── event.py
│   ├── filter.py
│   └── template.py
├── rich_render.py
├── services
│   ├── base.py
│   ├── computed_field_service.py
│   ├── event_service.py
│   ├── filter_service.py
│   ├── report_service.py
│   └── template_service.py
└── utils.py

I added the NiceGUI package to the project with “uv add nicegui”.

 

Planning

There are some major elements in the app:

  • EventType – Definition for a group of similar events. When created, a CSV file is loaded for an event of that type to define the fields available. Creation will also load a predefined set of ReportFilters and computed EventFields from a configuration file, based on the event type name entered.
  • EventTemplate – User defined template for a specific event type, specifying which fields will be displayed for a specific report.  EventFields from the EventType can be included/excluded in/from the displayed fields.
  • EventField – defined when an EventType is created. Has name, displayed name, type (“single” or one of the computed types), and separator and pattern characters that are used when the field represents a computed field.  For computed fields, the type can be “any”, “all”, “join”, or “has” and the name will be EventField names separated by “|”. For example, there can be a “Full Name” computed field with type “join”, ” ” separator, and name “First Name|Last Name” so that it is the combination of two EventFields.
  • DisplayField – Holds supplemental info for an EventField that is being displayed. Column order for the report is specified (1-N), and justification and width restrictions can be specified.
  • ReportFilter – Loaded when the event is created and consists of a name, condition (equal, not-equal, starts-with), value, and the display name of the EventField (which can be a computed field) to be filtered.

The key thing here is that the EventType is the central element. Every other element is associated with a specific EventType. With the CLI, many commands would specify the event type. For the web interface, I decided that one would select an event type (keeping the state), and then all the other operations would work with that event type.

One of the big hurdles is to figure out how to design the “look” of the web interface. I’m no expert in this area, and just did a bunch of sketches on how I think pages should look and how to perform the actions equivalent to the CLI version. I asked Grok AI for some ideas as well, and to provide examples of how to generate the UI components with NiceGUI.

The thought was to have a main menu for event types, templates, computed fields, filters, and report generation. Initially, only the event type page would be available. Once an event type is created and selected, the other menus would become available.

This is clearly written in stone, and will definitely not change… yeah right.

 

Getting a basic web page running

I created a ui directory, with a pages sub-directory to hold web code as I develop it. In the ui directory I created main.py and __init__.py. The key elements to it are:

def run_app(reload: bool = False):
    """Entry point used by 'uv run web' and direct execution"""
    setup()
    ui.run(
        title="Squatter",
        port=8080,
        reload=reload,
        show=True,
    )


# This allows both "uv run web" and "python -m squatter.ui.main"
if __name__ in {"__main__", "__mp_main__"}:
    run_app(reload=False)

 

I had this code, along with some @ui.page definitions recommended by Grok, and a page layout function:

@ui.page("/")
def home():
    ui.navigate.to("/event-types")


@ui.page("/event-types")
def event_types_page():
    main_layout("Event Types")
    ui.label("Event Types management coming soon...").classes("text-h6 m-8")

...
def main_layout(page_title: str = "Squatter Reports"):
    global left_drawer

    # Header
    with ui.header().classes("items-center justify-between"):
        with ui.row().classes("items-center gap-4"):
            ui.button(icon="menu", on_click=lambda: left_drawer.toggle() if left_drawer else None).props("flat round")
            ui.label(page_title).classes("text-h6 font-bold")

        with ui.row().classes("items-center gap-3"):
            ui.label("Active Event:").classes("text-gray-400 text-sm")
            ui.select(
                options=event_type_options,
                value=active_event_type,
                on_change=set_active_event,
                label="Select event type",
            ).classes("min-w-72").props("outlined dense")

            ui.button(
                "Manage Events",
                on_click=lambda: ui.navigate.to("/event-types"),
                icon="settings"
            ).props("flat")

        with ui.row().classes("items-center gap-2"):
            ui.button(icon="person", on_click=lambda: ui.notify("User menu coming soon")).props("flat round")

    # Left Drawer (created once)
    if left_drawer is None:
        with ui.left_drawer(value=True, fixed=True).classes("bg-gray-50 dark:bg-gray-900") as left_drawer:
            with ui.column().classes("w-full p-4 gap-1"):
                ui.label("SQUATTER").classes("text-2xl font-bold text-primary mx-2 my-4")
                ui.separator()

                def nav_item(label: str, target: str, icon: Optional[str] = None):
                    return (
                        ui.button(label, on_click=lambda: ui.navigate.to(target))
                        .props("flat align=left")
                        .classes("w-full justify-start")
                        .props(f"icon={icon}" if icon else "")
                    )

                nav_item("Event Types", "/event-types", "list")
                nav_item("Templates", "/templates", "article")
                nav_item("Computed Fields", "/computed-fields", "functions")
                nav_item("Filters", "/filters", "filter_list")
                ui.separator()
                nav_item("Generate Report", "/generate-report", "play_arrow")

    # Footer
    with ui.footer().classes("bg-transparent text-xs text-gray-500 justify-center"):
        ui.label("CLI power users welcome • Built with NiceGUI")

In pyproject.toml, I added a new script alias, so I can run the web app:

[project.scripts]
squatter = "squatter.cli.main:entry_point"
web = "squatter.ui.main:run_app"

 

Note: Initially, when I tried this using the alias, I was getting a runtime error, while running “uv run python squatter/ui/main.py” worked. After numerous tries with Grok to resolve it, and looking at my other project that worked, I couldn’t figure out what was wrong. I asked Gemini, and it immediately pointed out that run_app() had a default of True for reload, and that this default is what gets used when running via the alias, because the alias calls run_app() with no arguments. My guess is that NiceGUI starts tracking the app when it sees the @ui.page decorators, and when ui.run() then tries to spawn another child process, we get an error. After changing the default for the reload argument to False, it worked.

 

Working on the “look”

It took a bunch of iterations with Grok’s suggestions to get a page layout that seems reasonable. This has a fixed page layout with a sidebar menu on the left, a content area, and a button for a future user login. Some of it will be removed (the button for event management at the top), and some is for future additions (e.g. user).

I had asked Grok about the alternative of creating a Single Page Application (SPA) instead, and the following code was suggested:

"""Squatter - Single Page App Version"""

from typing import Optional

from nicegui import ui

# Global state
active_event_type: Optional[str] = None
event_type_options: list[str] = []

# Container references
left_drawer = None
main_content = None


def refresh_event_options():
    global event_type_options
    event_type_options[:] = [
        "Squad Training",
        "Officer Safety",
        "Range Qualification",
    ]


def set_active_event(e):
    global active_event_type
    if e.value:
        active_event_type = e.value
        ui.notify(f"Active Event set to: {e.value}", type="positive")


def show_page(page_name: str):
    """Update only the content area"""
    if main_content is None:
        return
    main_content.clear()
    with main_content:
        if page_name == "event-types":
            ui.label("Event Types management coming soon...").classes("text-h5")
        elif page_name == "templates":
            ui.label(f"Templates - {active_event_type or 'No Event Selected'}").classes(
                "text-h5"
            )
        elif page_name == "computed-fields":
            ui.label(
                f"Computed Fields - {active_event_type or 'No Event Selected'}"
            ).classes("text-h5")
        elif page_name == "filters":
            ui.label(f"Filters - {active_event_type or 'No Event Selected'}").classes(
                "text-h5"
            )
        elif page_name == "generate-report":
            ui.label("Generate Report coming soon...").classes("text-h5")


def nav_item(label: str, page: str, icon: Optional[str] = None, disabled: bool = False):
    btn = (
        ui.button(
            label,
            on_click=lambda: show_page(page) if not disabled else None,
        )
        .props("flat align=left")
        .classes("w-full justify-start text-white")
    )
    if icon:
        btn.props(f"icon={icon}")
    if disabled:
        btn.classes("text-gray-500 opacity-50 pointer-events-none")
    return btn


def build_ui():
    """Build header + drawer + content container"""
    global left_drawer, main_content

    # Header
    with ui.header().classes("items-center justify-between bg-primary text-white"):
        with ui.row().classes("items-center gap-4"):
            ui.button(
                icon="menu",
                on_click=lambda: left_drawer.toggle() if left_drawer else None,
            ).props("flat round color=white")
            ui.label("Squatter Reports").classes("text-h6 font-bold")

        with ui.row().classes("items-center gap-3"):
            ui.label("Active Event:").classes("text-white")
            ui.select(
                options=event_type_options,
                value=active_event_type,
                on_change=set_active_event,
                label="Select event type",
            ).classes("min-w-72").props("outlined dense")

            ui.button(
                "Manage Events",
                on_click=lambda: show_page("event-types"),
                icon="settings",
            ).props("flat color=white")

        with ui.row().classes("items-center gap-2"):
            ui.button(
                icon="person", on_click=lambda: ui.notify("User menu coming soon")
            ).props("flat color=white")

    # Left Drawer
    with ui.left_drawer(value=True, fixed=True, elevated=True).classes(
        "bg-gray-800 text-white"
    ) as left_drawer:
        with ui.column().classes("w-full p-4 gap-1"):
            ui.label("SQUATTER").classes("text-2xl font-bold text-primary mx-2 my-6")
            ui.separator().classes("bg-gray-600")

            nav_item("Event Types", "event-types", "list")

            has_active = bool(active_event_type)
            nav_item("Templates", "templates", "article", disabled=not has_active)
            nav_item(
                "Computed Fields",
                "computed-fields",
                "functions",
                disabled=not has_active,
            )
            nav_item("Filters", "filters", "filter_list", disabled=not has_active)
            ui.separator().classes("bg-gray-600")
            nav_item(
                "Generate Report",
                "generate-report",
                "play_arrow",
                disabled=not has_active,
            )

    # Main Content
    with ui.column().classes("w-full max-w-7xl mx-auto p-6") as main_content:
        pass  # Will be filled by show_page

# ==================== Root Page ====================

@ui.page("/")
def home():
    build_ui()
    show_page("event-types")  # Initial content

# ==================== Run ====================

def setup():
    refresh_event_options()

def run_app(reload: bool = False):
    setup()
    ui.run(
        title="Squatter",
        port=8080,
        reload=reload,
        show=True,
        dark=True,
    )

if __name__ in {"__main__", "__mp_main__"}:
    run_app(reload=False)

I’m going to try the SPA version for now. A key thing to figure out is how to ensure that the drawer items’ visibility gets updated when an event type is selected. I will also have to flesh out the simple pages (event type, computed field, filters, template) and the complex one (template management).
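One possible way to approach the drawer question is to keep the “which items are enabled” decision in a plain function, and redraw the drawer items from it whenever the selection changes. The sketch below shows only that decision step, with no NiceGUI calls, so it can be checked on its own. The NavItem class, the NAV_ITEMS list, and nav_states() are hypothetical names, not part of the code above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NavItem:
    label: str
    page: str
    icon: str
    needs_event: bool = True  # most pages only make sense with an active event type


NAV_ITEMS = [
    NavItem("Event Types", "event-types", "list", needs_event=False),
    NavItem("Templates", "templates", "article"),
    NavItem("Computed Fields", "computed-fields", "functions"),
    NavItem("Filters", "filters", "filter_list"),
    NavItem("Generate Report", "generate-report", "play_arrow"),
]


def nav_states(active_event_type: Optional[str]) -> list[tuple[NavItem, bool]]:
    """Pair each drawer item with a 'disabled' flag for the current selection."""
    has_active = bool(active_event_type)
    return [(item, item.needs_event and not has_active) for item in NAV_ITEMS]


for item, disabled in nav_states(None):
    print(f"{item.label}: {'disabled' if disabled else 'enabled'}")
```

In build_ui(), the has_active flag is computed once when the drawer is built, which is why the items never change afterwards. A loop over nav_states(active_event_type) could instead live in a function that is re-run from set_active_event(); NiceGUI’s ui.refreshable decorator looks like a natural fit for that redraw step, though I have not tried it here.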

How-Tos for NiceGUI…

During the process of creating a user interface with NiceGUI, that would blend into the service and repository layers that exist, I had to resolve several things. Several of these were handled while implementing the UI for the EventType resource, but as I work through this, I suspect I’ll see them in other areas as well.

Using “id”

For the edit and rename commands for EventType resources, the Web UI works much better with the “id” of the entry, rather than the “name” that the CLI version used (and would call down to the service layer and then repository layer to identify the object).

To support this, I first changed the EventInfo resource, which is a domain representation of the EventType, to also have the event ID. The list method in the service layer, which is used by both the CLI and the Web UI, was modified to:

def list_events(self) -> list[EventInfo]:
    """List all events."""
    with get_db() as db:
        events = EventRepository(db).get_all()
        return [
            EventInfo(
                id=e.id,
                name=e.name,
                created_at=e.created_at,
                num_templates=len(e.templates),
                num_fields=len(e.fields),
                num_filters=len(e.filters),
            )
            for e in events
        ]

With that, when the user clicks to rename or delete an entry, the ID can be used, instead of the name. This required changing the service layer methods:

def rename(self, event_id: int, new_name: str)
def delete(self, event_id: int)

These make use of the existing repository method:

event = EventRepository(db).get_by_id(event_id)

 

File Uploading

In the CLI, the user provides the filename, and then the service layer will open the file, and use the CSV library function to get the headers. For the first cut at doing this for a web interface, we’ll make use of the file upload feature, and present a dialog:

def show_create_event_type_dialog():
    """Open create dialog"""
    with ui.dialog() as dialog, ui.card().classes("w-full max-w-md"):
        ui.label("Create New Event Type").classes("text-h6")

        name_input = ui.input(
            "Event Type Name", placeholder="e.g. steel challenge"
        ).classes("w-full")

        ui.upload(
            label="Upload CSV Definition File",
            multiple=False,
            auto_upload=True,
            on_upload=lambda e: handle_uploaded_file(name_input.value, e, dialog),
        ).props("accept=.csv").classes("w-full")

        with ui.row().classes("gap-2 justify-end w-full"):
            ui.button("Cancel", on_click=dialog.close)

        dialog.open()

For the actual upload, we cheat a bit and save the upload in a temporary file, and then call the service layer to process it using the temporary file name. It was really kludgey, as there appear to be some differences between NiceGUI versions in how the uploaded content is accessed. The code we ended up with:

@handle_ui_errors("Failed to create event type")
async def handle_uploaded_file(name: str, upload_event, dialog):
    """Called when file upload completes"""
    if not name or not name.strip():
        ui.notify("Event Type name is required", type="warning")
        return

    if not upload_event or not upload_event.file:
        ui.notify("No file received", type="warning")
        return

    service = EventService()
    config = get_config()

    file_obj = upload_event.file

    try:
        # Robust content reading for different NiceGUI versions
        if hasattr(file_obj, "content") and hasattr(file_obj.content, "read"):
            csv_content = await file_obj.content.read()
        elif hasattr(file_obj, "read"):
            csv_content = await file_obj.read()
        else:
            # Fallback
            csv_content = file_obj.content if hasattr(file_obj, "content") else file_obj

        # Ensure we have bytes
        if isinstance(csv_content, str):
            csv_content = csv_content.encode("utf-8")
        elif not isinstance(csv_content, bytes):
            csv_content = str(csv_content).encode("utf-8")

        # Temporary file for existing service layer
        import tempfile
        from pathlib import Path

        temp_path = None
        try:
            with tempfile.NamedTemporaryFile(
                delete=False, suffix=".csv", mode="wb"
            ) as tmp:
                tmp.write(csv_content)
                temp_path = tmp.name

            service.create_event(name.strip(), temp_path, config)

            ui.notify(f"Event Type '{name}' created successfully!", type="positive")
            dialog.close()
            show_page("event-types")

        finally:
            if temp_path:
                Path(temp_path).unlink(missing_ok=True)

    except Exception as e:
        logger.exception("Error processing uploaded CSV")
        ui.notify(f"Failed to process uploaded file: {e}", type="negative")

The next step was to use the normal file upload and process the content directly with the CSV library, without a temporary file. For that, the upload handler was changed to:

@handle_ui_errors("Failed to create event type")
async def handle_uploaded_file(name: str, upload_event, dialog):
    if not name or not name.strip():
        ui.notify("Event Type name is required", type="warning")
        return

    if not upload_event or not upload_event.file:
        ui.notify("No file received", type="warning")
        return

    service = EventService()
    config = get_config()

    file_obj = upload_event.file
    # Robust content reading for different NiceGUI versions
    if hasattr(file_obj, "content") and hasattr(file_obj.content, "read"):
        csv_content = await file_obj.content.read()
    elif hasattr(file_obj, "read"):
        csv_content = await file_obj.read()
    else:
        # Fallback
        csv_content = file_obj.content if hasattr(file_obj, "content") else file_obj

    service.create_event_from_content(name.strip(), csv_content, config)

    ui.notify(f"Event Type '{name}' created successfully!", type="positive")
    dialog.close()
    show_page("event-types")

In the service layer, a new method (create_event_from_content) was created to process uploaded content, alongside the existing method (create_event) that opens a file by name. The common logic was extracted into create_event_from_header:

    def create_event_from_header(self, name, config, header):
        """Create event and store filters and computed fields."""
        logger.info("Have %d CSV file fields", len(header))
        with get_transactional_db() as db:
            repo = EventRepository(db)
            if repo.get(name):
                raise ConflictError(f"Event '{name}' already exists.")
            event = repo.create(name)

            computed_fields = config.computed_fields | config.computed_fields_for(name)
            all_fields = build_fields(header) + add_computed_fields(
                header, computed_fields
            )
            repo.add_fields(event, all_fields)
            msg = (
                f"Event '{name}' created with {len(all_fields)} "
                f"fields ({len(computed_fields)} computed)."
            )
            logger.info(msg)

            config_filters = config.filters | config.filters_for(name)
            filters = load_filters_from_config(config_filters)
            report_filters = {
                k: from_field_filter(v, event.id) for k, v in filters.items()
            }
            FilterRepository(db).upsert_filters(report_filters)
            msg = f"Loaded {len(filters)} filters from config."
            logger.info(msg)

    def create_event_from_content(
        self, name: str, csv_content: bytes | str, config: Config
    ) -> None:
        """Create event from CSV content (for web UI)"""
        if isinstance(csv_content, str):
            csv_content = csv_content.encode("utf-8")
        elif not isinstance(csv_content, bytes):
            csv_content = str(csv_content).encode("utf-8")

        header = read_header_from_stream(csv_content)

        self.create_event_from_header(name, config, header)

    def create_event(self, name: str, csv_file: str, config: Config) -> None:
        """Create a new event."""
        header = read_header(csv_file)
        self.create_event_from_header(name, config, header)

Then, in the loader.py module that handled processing the CSV file, a version was added to take a stream, instead of a filename:

def read_header_from_stream(csv_content: bytes) -> Sequence[str]:
    """Read the header row from in-memory CSV content."""
    try:
        # Convert bytes to a text stream for the CSV reader
        text_stream = io.StringIO(csv_content.decode("utf-8"))
        reader = csv.DictReader(text_stream)
        fieldnames = reader.fieldnames
    except Exception as exc:
        raise SourceFileReadError(f"Failed to parse CSV content: {exc}") from exc
    # Checked outside the try block, so this error is not re-wrapped above
    if not fieldnames:
        raise ConfigurationError("No header found in CSV file")
    return fieldnames

With this change, we now will upload a file, process the stream using the CSV library, and create the EventType.
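As a quick standalone check of the idea, here is a simplified stand-in that can be run without the rest of the app. It is not the loader code above: the function name and sample data are made up, it uses csv.reader rather than csv.DictReader, and decoding with “utf-8-sig” (to tolerate a byte-order mark at the start of the export) is an assumption about the files, not something I have confirmed is needed.

```python
import csv
import io


def header_from_bytes(csv_content: bytes) -> list[str]:
    """Return the first CSV row (the header) from raw uploaded bytes."""
    # "utf-8-sig" decodes plain UTF-8 and also drops a leading BOM if present.
    text_stream = io.StringIO(csv_content.decode("utf-8-sig"), newline="")
    header = next(csv.reader(text_stream), None)
    if not header:
        raise ValueError("No header found in CSV content")
    return header


sample = b"\xef\xbb\xbfFirst Name,Last Name,Squad\r\nPat,Smith,Squad 1\r\n"
print(header_from_bytes(sample))  # ['First Name', 'Last Name', 'Squad']
```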

 

UI Testing

To test the UI portion of the app, we’ll test two parts. First is the handlers in the UI, like the do_rename() function that was created. These unit tests (UT) will use pytest, and mock the calls to the service layer. For example, for do_delete:

@handle_ui_errors("Failed to delete")
def do_delete(event_id: int, dialog):
    EventService().delete(event_id)
    ui.notify("Event type deleted", type="positive")
    dialog.close()
    show_page("event-types")

The test would be:

from unittest.mock import MagicMock, patch


@patch("squatter.ui.ui_utils.ui")
@patch("squatter.ui.pages.event_types.show_page")
@patch("squatter.ui.pages.event_types.ui")
@patch("squatter.services.event_service.EventService")
def test_do_delete_calls_service(mock_service_cls, mock_ui, mock_show_page, mock_utils_ui):
    # Import INSIDE the test, after patches are active
    from squatter.ui.pages.event_types import do_delete

    mock_service = MagicMock()
    mock_service_cls.return_value = mock_service
    mock_dialog = MagicMock()

    do_delete(event_id=7, dialog=mock_dialog)

    mock_service.delete.assert_called_once_with(7)
    mock_ui.notify.assert_called_once_with("Event type deleted", type="positive")
    mock_dialog.close.assert_called_once()
    mock_show_page.assert_called_once_with("event-types")

Note that we import do_delete inside of the test, so that the service has already been mocked by the time the page module is imported. EventService had to be mocked at its source (instead of where it is used), because it is imported using “from squatter.services.event_service import EventService”.
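The interaction between patching at the source and a “from ... import ...” line can be shown with two throwaway modules. Everything here is a stand-in: “fake_service” and “fake_page” are hypothetical names built in memory, not the real squatter modules. The point is only that a module imported while the patch is active binds the mock, while one imported earlier keeps the real class.

```python
import sys
import types
from unittest.mock import patch

# Stand-in for the service module (hypothetical name "fake_service").
service_mod = types.ModuleType("fake_service")


class EventService:
    def delete(self, event_id: int) -> str:
        return f"really deleted {event_id}"


service_mod.EventService = EventService
sys.modules["fake_service"] = service_mod

# Stand-in for a page module that binds the name at import time.
PAGE_SOURCE = """
from fake_service import EventService

def do_delete(event_id):
    return EventService().delete(event_id)
"""


def import_page() -> types.ModuleType:
    """Execute the page source as a fresh module, like a first import."""
    page = types.ModuleType("fake_page")
    exec(PAGE_SOURCE, page.__dict__)
    return page


early_page = import_page()  # imported BEFORE the patch: holds the real class

with patch("fake_service.EventService") as mock_cls:
    late_page = import_page()  # imported while the patch is active
    late_page.do_delete(7)
    mock_cls.return_value.delete.assert_called_once_with(7)
    # The early import still calls the real service, despite the patch.
    print(early_page.do_delete(7))  # really deleted 7
```

This is why importing inside the test, after the patches are active, matters for this approach: a page module that was already imported would not see the mock.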

The second part is to test the UI logic by using the NiceGUI User testing package. We install the package with “uv add nicegui[testing]”, and then can create tests. For example, with the code:

@handle_ui_errors("Failed to delete event type")
def delete_event_type(event_id: int, name: str):
    """Present dialog for deleting an event."""
    ui.dialog().props("persistent").classes("bg-red-100")  # simple confirmation
    with ui.dialog() as dialog, ui.card().classes("w-full max-w-sm"):
        ui.label(f"Delete '{name}'?").classes("text-h6 text-red-600")
        ui.label(
            "This will delete all templates, filters, and "
            "computed fields for this event type."
        ).classes("text-red-500")

        with ui.row().classes("gap-2 justify-end"):
            ui.button("Cancel", on_click=dialog.close)
            ui.button(
                "Delete", color="negative", on_click=lambda: do_delete(event_id, dialog)
            )

        dialog.open()

The test would look like:

@patch("squatter.services.event_service.EventService")
async def test_delete_dialog_appears(mock_service_cls, user: User):
    mock_service = MagicMock()
    mock_service_cls.return_value = mock_service

    @ui.page("/test-delete")
    def test_page():
        from squatter.ui.pages.event_types import delete_event_type
        ui.button("trigger", on_click=lambda: delete_event_type(event_id=1, name="MyEvent"))

    await user.open("/test-delete")
    user.find("trigger").click()  # no await

    await user.should_see("Delete 'MyEvent'?")
    await user.should_see("This will delete all templates")

Here, we needed a page context, so there is a test_page(). In conftest.py, you need to define the User fixture. It’s also safest to force the page modules to be re-imported between tests, as page routes may not get re-registered when using submodules.

@pytest.fixture
async def user(user: User) -> User:
    return user

@pytest.fixture(autouse=True)
def clear_squatter_page_modules():
    """Force re-registration of page modules between tests."""
    modules_to_remove = [
        name for name in sys.modules
        if name == "squatter.ui.pages" or name.startswith("squatter.ui.pages.")
    ]
    for name in modules_to_remove:
        sys.modules.pop(name, None)
    yield
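The reason popping entries from sys.modules helps is that Python runs a module's body (including any decorators at module level) only on the first import. The sketch below shows that effect with made-up names: demo_registry is a plain list standing in for a route table, and demo_page_module is a hypothetical page module written to a temporary directory.

```python
import importlib
import sys
import tempfile
import types
from pathlib import Path

# Hypothetical stand-in for a route table that page modules register with.
registry = types.ModuleType("demo_registry")
registry.routes = []
sys.modules["demo_registry"] = registry

# Hypothetical page module whose import has a side effect (a registration).
tmp_dir = tempfile.mkdtemp()
Path(tmp_dir, "demo_page_module.py").write_text(
    "from demo_registry import routes\n"
    "routes.append('/demo')\n",
    encoding="utf-8",
)
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

importlib.import_module("demo_page_module")  # first import runs the body
importlib.import_module("demo_page_module")  # cached, body is NOT run again
count_while_cached = len(registry.routes)

sys.modules.pop("demo_page_module", None)    # same idea as the fixture above
importlib.import_module("demo_page_module")  # body runs again
count_after_pop = len(registry.routes)

sys.path.remove(tmp_dir)
```

After the pop, the module body executes a second time, so the registration happens again. That is the behavior the autouse fixture is trying to force between tests.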

And in pyproject.toml, include async mode, main file, and options:

[tool.pytest.ini_options]
testpaths = ["tests"]
pythonpath = ["."]
python_files = ["test_*.py"]
asyncio_mode = "auto"
main_file = "squatter/ui/main.py"
addopts = "-p nicegui.testing.user_plugin"

We also need to install the pytest-asyncio package (uv add pytest-asyncio --dev).

 

Catching Exceptions In IT

For one method in the Web UI, there was a try/except block that would catch any error and tell the user:

try:
    ...
except Exception as e:
    logger.error("Failed to load events: %s", e)
    ui.notify("Failed to load event types", type="negative")

To test this, I created a test to cause an exception in the service layer, and then checked that the caplog would see the message. However, I was getting a test teardown error, even though the test was working. It turns out that with the User package, if there are error log messages during a test run, they are considered a failure. The solution is to clear the captured log:

@patch("squatter.services.event_service.EventService")
async def test_load_event_types_table_failure(mock_service_cls, user: User, caplog):
    mock_service = MagicMock()
    mock_service.list_events.side_effect = Exception("Mock database error")
    mock_service_cls.return_value = mock_service

    table_ref = {}

    @ui.page("/test-table-failure")  # use a unique path
    def test_page():
        from squatter.ui.pages.event_types import load_event_types_table
        table = ui.table(columns=[], rows=[], row_key="name")
        table_ref["table"] = table
        load_event_types_table(table)

    with caplog.at_level(logging.ERROR):
        await user.open("/test-table-failure")

    assert not table_ref["table"].rows
    assert "Failed to load events: Mock database error" in caplog.text

    # Clear the log so NiceGUI's teardown check doesn't see the ERROR entry
    caplog.clear()

Note that a unique endpoint path was used. These endpoints persist between tests, so we don’t want any conflicts from reusing the same path.

Graceful Exiting

With the app running (running the web server), to exit, you can press Control-C. However, you’ll get a traceback with the KeyboardInterrupt exception. One simple change can be made to gracefully handle this:

def run_app(reload: bool = False):
    setup()

    try:
        ui.run(
            title="Squatter",
            port=8080,
            reload=reload,
            show=True,
            dark=True,
            storage_secret=os.getenv(
                "SESSION_SECRET", "local-dev-key"
            ),  # must match SessionMiddleware
        )
    except KeyboardInterrupt:
        logger.info("Shutdown requested by user.")
    except Exception as e:
        logger.error("Unexpected error: %s", e)

Coverage Testing

When running coverage tests, I was seeing tons of these messages at the end of the run:

/Users/pcm/workspace/kubernetes/squatter/.venv/lib/python3.13/site-packages/_pytest/unraisableexception.py:33: ResourceWarning: unclosed database in <sqlite3.Connection object at 0x10ae2dc60>
gc.collect()
ResourceWarning: Enable tracemalloc to get the object allocation traceback

ClaudeAI mentioned that I need to dispose of the SQLite connection before the garbage collector runs. That can be done with:

@pytest.fixture(autouse=True)
def setup_test_db():
    init_engine("sqlite:///:memory:")
    init_schema()
    yield

    from squatter.database import get_engine
    try:
        engine = get_engine()
        engine.dispose(close=True)
    except RuntimeError:
        pass  # already reset by the test
    reset_engine()

I was still seeing one ResourceWarning (not tons), and could not get a fixture to resolve it, so will ignore the single warning.

For some tests, the ui module has to be mocked. In those cases, a fixture can be created to mock the package:

@pytest.fixture(autouse=True)
def mock_ui():
    # Target the module attribute directly to override the cache
    with patch("squatter.ui.pages.fields.ui") as mock:
        yield mock

There are some cases where we want to mock some methods of a class, but not the one(s) under test. In this case, it involved two things. First, the class under test is imported and instantiated INSIDE the test case, so that it is imported after patches have been applied. Second, the patch.object function can be used to perform the mocking. For example:

def test_do_save(mock_ui):
    """Test saving changes."""
    from squatter.ui.pages.fields import FieldManager

    mock_compute_service = MagicMock()
    mock_field_service = MagicMock()
    the_field = MagicMock()
    the_field.id = 1
    the_field.display_name = "DisplayName1"

    manager = FieldManager(
        "MyEvent",
        field_service=mock_field_service,
        computed_service=mock_compute_service,
    )
    manager._active_field = the_field

    with (
        patch.object(manager, "_clear_editor") as mock_clear,
        patch.object(manager.state, "reload") as mock_reload,
        patch.object(manager, "validate", return_value=True),
    ):
        manager.do_save()
        mock_field_service.save.assert_called_once_with("MyEvent", the_field)
        mock_ui.notify.assert_called_once_with(
            "Updated field 'DisplayName1'.", type="positive"
        )
        mock_reload.assert_called_once()
        mock_clear.assert_called_once()
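For anyone who wants to try the patch.object technique without the Squatter code, here is a stripped-down sketch. The Editor class is hypothetical (a stand-in for FieldManager), and only the method under test, do_save, runs for real.

```python
from unittest.mock import MagicMock, patch


class Editor:
    """Hypothetical class, standing in for FieldManager."""

    def __init__(self, service):
        self.service = service
        self.cleared = False

    def validate(self, value):
        raise RuntimeError("validation is not wanted in this test")

    def _clear(self):
        self.cleared = True

    def do_save(self, value):
        """The method under test; it calls the two helpers above."""
        if not self.validate(value):
            return False
        self.service.save(value)
        self._clear()
        return True


service = MagicMock()
editor = Editor(service)

# Mock the helpers on this one instance, leaving do_save() untouched.
with (
    patch.object(editor, "validate", return_value=True) as mock_validate,
    patch.object(editor, "_clear") as mock_clear,
):
    result = editor.do_save("abc")
    mock_validate.assert_called_once_with("abc")
    service.save.assert_called_once_with("abc")
    mock_clear.assert_called_once()
```

Because _clear was replaced by a mock, editor.cleared stays False, which confirms that only do_save ran real code.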

Integration Testing

There are some cases where we want to test that the basic page layout is correct. One way is to use nicegui.testing.user package. Here is one example that I created…

from unittest.mock import MagicMock, patch

from nicegui import ui
from nicegui.testing import User

from squatter.ui.pages.template_editor import TemplateEditor
from tests.ui.utils import make_mock_fields


async def test_refresh_list(user: User):
    """Test the display of the selected lists with entries."""
    all_fields = make_mock_fields(3)
    for i in range(3):
        all_fields[i].position = i + 1
    mock_service = MagicMock()
    mock_service.get_template_fields.return_value = (123, all_fields)

    @ui.page("/test-template-editor-lists")
    def test_page():
        editor = TemplateEditor("MyTemplate", "MyEvent", mock_service)
        ui.button(
            "trigger",
            on_click=lambda: editor.refresh_lists(),
        )

    await user.open("/test-template-editor-lists")
    user.find("trigger").click()  # no await

    await user.should_see("DisplayName1")
    await user.should_see("DisplayName2")
    await user.should_see("DisplayName3")
    await user.should_see("auto  default")
    # Available list is empty...
    await user.should_see("(none)")

Essentially, we create a test page with a button that invokes a method that performs UI calls for our application. Then, we open the page and simulate a click. From there, we can check that the expected elements were created as a result. The same syntax works for specific types of elements (a button, for example):

await user.should_see(content="Save", kind=ui.button)

The function being tested performs various UI calls. I had some code where the function was adding elements to a top-level container. In that case, I had the test page create the container. For example:

    @ui.page("/test-template-editor")
    def test_page():
        import squatter.ui.pages.template_editor as te_module

        set_active_event_name("MyEvent")
        set_event_options(["MyEvent", "YourEvent"])
        build_ui()

        ui.button(
            "trigger",
            on_click=lambda: te_module.show_template_editor("MyTemplate", "MyEvent"),
        )

    await user.open("/test-template-editor")
    user.find("trigger").click()

Here, the build_ui function creates the containers that the page is expecting to use to add elements. Note that I use a different URI for every test function, as these may persist across test cases, and we don’t want any conflicts.

Classes and Styles

Columns

One thing that I didn’t have much understanding of, was all of the NiceGUI classes and styles and how to use them to create lists and other elements in the code. I queried AIs to gain some insight into commonly used patterns.

For columns of data, you want to first create a ui.row() and then, inside of it, create a ui.column() block for each column. For example:

            with (
                ui.row().classes("w-full flex-nowrap gap-4").style("overflow: hidden;")
            ):
                # Column 1: Templates available
                with ui.column().classes("gap-2").style("flex: 1; min-width: 0;"):
                    ui.label("Available Templates (select one)").classes("text-bold")
                    self.template_list_container = (
                        ui.column()
                        .classes("w-full border rounded overflow-y-auto gap-0")
                        .style("height: 480px;")
                    )
                # Column 2: Filters available
                with ui.column().classes("gap-2").style("flex: 1; min-width: 0;"):
                    ui.label("Available Filters (multiple allowed)").classes(
                        "text-bold"
                    )
                    self.filters_list_container = (
                        ui.column()
                        .classes("w-full border rounded overflow-y-auto gap-0")
                        .style("height: 480px;")
                    )
                # Column 3: Input and output selections
                with ui.column().classes("gap-2").style("flex: 1; min-width: 0;"):
                    ui.label("Input/Output Selections").classes("text-bold")
                    ui.upload(
                        label="Upload event data CSV file",
                        multiple=False,
                        auto_upload=True,
                        on_upload=lambda e: self._handle_uploaded_file(e),
                    ).props("accept=.csv").classes("w-full")
                    self._layout_select = ui.select(
                        label="Output Format",
                        options=["Rich", "CSV", "Excel"],
                        value="Rich",
                        # on_change=self._on_field_type_change,
                    ).classes("w-full")

                    ui.button("Generate", on_click=self.validate).props(
                        "color=positive"
                    )
                    ui.button("Clear", on_click=self.do_clear).props(
                        "flat color=secondary"
                    )

Lists

For lists, I created a ui.column() “container”, like the self.filters_list_container and self.template_list_container, above. In these cases, I wanted a fixed height list (style “height: 480px;”), a border (classes “border rounded”), and I wanted a scroll bar, if there are more elements (class “overflow-y-auto”). Then, at the end of the method that generates the layout, I call a refresh method that will clear and then populate the list. For example, here is a list that allows multiple selections:

    def refresh_filters_list(self):
        """Populate list of filters."""
        self.filters_list_container.clear()
        self.render_filters()

    def render_filters(self):
        """Display list of filters."""
        with self.filters_list_container:
            if not self.filters:
                ui.label("(no filters)").classes("text-grey-5 q-pa-sm")
                return
            for t in self.filters:
                self._render_filter_row(t)

    def _render_filter_row(self, f: FilterInfo) -> None:
        """Render a single filter row."""
        is_hi = f in self._highlighted_filters
        border_style = "border-bottom: 1px solid #e0e0e0;"
        row_style = (
            "background-color: #1565C0; color: white;" if is_hi else ""
        ) + border_style
        with (
            ui.row()
            .classes(
                "w-full items-center q-px-sm q-py-xs cursor-pointer "
                + ("" if is_hi else "hover:bg-grey-2")
            )
            .style(row_style)
            .on("click", lambda _, e=f: self.toggle_highlighted_filter(e))
        ):
            ui.label(f.name).classes("text-body2")

    def toggle_highlighted_filter(self, f: FilterInfo) -> None:
        """Change selection of filter."""
        if f in self._highlighted_filters:
            self._highlighted_filters.remove(f)
        else:
            self._highlighted_filters.append(f)
        self.refresh_filters_list()

The refresh method can be called when there is a change to the list, too. In the above, when the user clicks on an entry, it will add/remove the entry from a list of _highlighted_filters. When the row is displayed, if it is in the highlighted list, the background will be changed.

Another important point is that, for my lists, I also wanted each entry to be separated by a line. Initially, I tried using the classes “divide-y divide-grey-3” on the column, but would end up with no line under the last item when there were fewer items than would fill the height of the list. To fix this, instead of adding those classes to the ui.column(), I added the style “border-bottom: 1px solid #e0e0e0;” to each row. This draws a line after every row.

Another issue I saw was that, by default, there is some spacing around the elements. This looks odd when multiple adjacent elements are highlighted, as there is a gap between rows. To solve that, I added the class “gap-0” to the ui.column(), so that there is no extra space between rows, and the highlighting of adjacent items is completely filled in.

If the rows themselves are too compressed, instead of having a ui.row() class of “q-py-none”, you can use “q-py-xs”, like I did, so things are not so crammed.

 

Columns with buttons

For one of the pages, I had three columns. The outer two were lists, and the middle had several buttons stacked vertically. The issue I had was that the buttons were at the top of the column, right in line with the title text for the lists. A better solution is to center those buttons vertically. To do that, I used two mechanisms. On the enclosing row for the columns, I added “items-stretch”, so that all columns will be the same height (with the lists being the tallest). The other is to add “justify-center” on the middle column, so that the buttons are in the middle (vertically). An example:

                    with (
                        ui.row()
                        .classes("w-full flex-nowrap gap-2 items-stretch")
                        .style("overflow: hidden; min-width: 0;")
                    ):
                        # Sub-column 1: Available fields
                        with (
                            ui.column().classes("gap-1").style("flex: 1; min-width: 0;")
                        ):
                            ui.label("Available").classes("text-caption text-bold")
                            self.available_container = (
                                ui.column()
                                .classes("w-full border rounded overflow-y-auto gap-0")
                                .style("height: 480px;")
                            )

                        # Sub-column 2: Transfer controls
                        with (
                            ui.column()
                            .classes("items-center justify-center gap-2")
                            .style("width: 80px; flex-shrink: 0;")
                        ):
                            ui.button("→", on_click=self.select_from_available).props(
                                "outline dense color=primary"
                            ).classes("w-full")
                            ui.button("←", on_click=self.remove_selected_field).props(
                                "outline dense color=secondary"
                            ).classes("w-full")
                            ui.button("Save", on_click=self.do_save).props(
                                "color=positive"
                            )
                            ui.button("Cancel", on_click=self.do_cancel).props(
                                "flat color=negative"
                            )

                        # Sub-column 3: Selected fields
                        with (
                            ui.column().classes("gap-1").style("flex: 1; min-width: 0;")
                        ):
                            ui.label("Selected").classes("text-caption text-bold")
                            self.selected_container = (
                                ui.column()
                                .classes("w-full border rounded overflow-y-auto gap-0")
                                .style("height: 480px;")
                            )

Elements, Class, and Style Definitions

I had asked ClaudeAI for an explanation of the various classes and styles. Here’s what I got back:

What the layout classes and styles actually mean

ui.row() vs ui.column()

These map directly to CSS flexbox:

NiceGUI      | CSS equivalent                        | Children flow
ui.row()     | display: flex; flex-direction: row    | Left → right
ui.column()  | display: flex; flex-direction: column | Top → bottom

This is your root problem: your three “sub-columns” were inside a ui.column(), so they stacked vertically. Switching to ui.row() makes them sit side by side.

 

Tailwind/Quasar class breakdown

Sizing and width

  • w-full — width: 100%. Makes the element fill its parent’s width.
  • min-width: 0 — Critical in flex layouts. By default, flex items won’t shrink below their content’s natural width. Setting this to 0 allows them to shrink properly and prevents overflow.

Flex behavior

  • flex-nowrap — Prevents flex children from wrapping to a new line. Without this, items that don’t fit will wrap downward instead of staying side by side.
  • flex-grow — Tells the element to expand and fill available space in the flex container.
  • flex-shrink: 0 — Prevents the element from shrinking (used on the fixed-width controls column so it stays 80px).
  • flex: 1 / flex: 3 — Shorthand for flex-grow: 1/3; flex-shrink: 1; flex-basis: 0. The numbers are ratios — so flex: 3 gets three times the space of flex: 1. Your form column gets 1 part, the selector area gets 3 parts.
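As a rough check on the ratio arithmetic, the sketch below computes column widths for a row of fixed-width and flex items. It is a simplification of what the browser does: it assumes a flex-basis of 0, ignores padding and minimum content widths, and assumes gap-2 is 8px and gap-4 is 16px (a 16px root font size). The function name flex_widths is made up for the illustration.

```python
def flex_widths(container_px, gap_px, items):
    """Split a row's width among its children.

    items is a list of ("fixed", pixels) or ("flex", ratio) tuples.
    Returns the width of each child, in the same order.
    """
    total_gap = gap_px * (len(items) - 1)
    fixed_total = sum(value for kind, value in items if kind == "fixed")
    ratio_total = sum(value for kind, value in items if kind == "flex")
    free_space = container_px - total_gap - fixed_total

    widths = []
    for kind, value in items:
        if kind == "fixed":
            widths.append(value)
        else:
            # Each flex item gets its share of whatever space is left over.
            widths.append(free_space * value / ratio_total)
    return widths


# Two lists with an 80px control column between them (gap-2).
three_columns = flex_widths(1000, 8, [("flex", 1), ("fixed", 80), ("flex", 1)])

# A flex: 1 form column next to a flex: 3 selector area (gap-4).
one_to_three = flex_widths(816, 16, [("flex", 1), ("flex", 3)])
```

With a 1000px row, the two lists each get 452px and the control column stays at 80px. With an 816px row, the flex: 3 area gets 600px against 200px for the flex: 1 column.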

Spacing

  • gap-2 / gap-4 — Adds uniform spacing between children (0.5rem and 1rem respectively). Cleaner than adding mb-2 to every child individually.
  • mb-2, mt-2 — Margin bottom/top. Fine for one-offs but gap-* on the parent is usually tidier.

Alignment

  • items-center — Aligns children on the cross axis (vertically in a row, horizontally in a column).
  • justify-center — Aligns children on the main axis (horizontally in a row, vertically in a column).
  • justify-between — Spreads children with space between them (used in your header to push label left and any action right).

Typography

  • text-h6 — Quasar heading size (maps to Material Design h6).
  • text-caption — Smaller caption text.
  • text-bold — Bold weight. (Your original used font-bold which is Tailwind; text-bold is the Quasar equivalent — either may work depending on your setup.)
  • text-grey-7 — Quasar’s grey palette, level 7 (medium-dark grey).

Overflow and scrolling

  • overflow: hidden on the outer row — Prevents the row itself from overflowing the page when content is large.
  • overflow-y-auto on the list containers — Adds a vertical scrollbar only when content exceeds the container height. Combined with the fixed height: 480px, this makes the lists scrollable boxes.

Visual

  • border — Adds a 1px border (Tailwind default: 1px solid currentColor).
  • rounded — Applies border-radius for slightly rounded corners.

 

Additional change to center the middle control column vertically…

 

  • items-stretch on the ui.row() — makes all children grow to the same height as the tallest sibling (the lists). Without this, each column only takes its natural content height.
  • justify-center was already on your controls column — once the column is stretched to full height, this vertically centers its buttons within that space.

Adding Alembic Support

I modified the filter condition (an enum) to add two more conditions. Fortunately, in my SQLite database, this field is defined as VARCHAR(11), and SQLite ignores the declared size. Adding new conditions just works, because SQLAlchemy’s definition as an enum, with native_enum=False, causes it to store strings with no value checking.
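The length behavior is easy to confirm with the sqlite3 module from the standard library. This sketch uses an in-memory database and a cut-down version of the report_filter table (only the condition column), so it does not touch templates.db.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE report_filter (condition VARCHAR(11) NOT NULL)")

# 16 characters, which is longer than the declared length of 11.
long_value = "INT_GREATER_THAN"
conn.execute("INSERT INTO report_filter (condition) VALUES (?)", (long_value,))

stored = conn.execute("SELECT condition FROM report_filter").fetchone()[0]
conn.close()
```

The value comes back untruncated, which is why the new enum members worked without a schema change. PostgreSQL does enforce the length, so the column still needs to be widened before moving databases.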

I should just change this to a VARCHAR(32) as in:

condition: Mapped[str] = mapped_column(String(32))

However, I’m likely to make more changes, and I will transition to PostgreSQL later, so I want to have support for migration of databases, and Alembic is something I’ve used before. I’m rusty on what needs to be done, so I asked ChatGPT this time for some guidance, with the assumption that the current database will be the baseline, and that the above change will be my first migration.

Install Needed Support

Since I want to convert to PostgreSQL later, I’ll install support for it now, and will initialize Alembic with:

uv add alembic psycopg[binary]
uv run alembic init alembic

Point to modules

We want alembic/env.py to point to our models, so I changed the “target_metadata” line to the following:

from squatter.models import Base

target_metadata = Base.metadata

After the “config” line in the file, I added the following, so that it refers to the DATABASE_URL environment variable that I already have for my app:

import os

database_url = os.getenv("DATABASE_URL")
if database_url:
    config.set_main_option("sqlalchemy.url", database_url)

In alembic.ini, I set the default database to point to my current database, by changing the line:

sqlalchemy.url = sqlite:///templates.db

Note: I set the DATABASE_URL environment variable at the start of the CLI and UI code. I will likely set the environment variable when starting up the container later, but to make it simpler, I created a helper function, get_database_url(), in setup.py that returns the DATABASE_URL environment variable, or sets it to “sqlite:///templates.db” as a default. Then, in env.py (and in the code), I imported get_database_url() and called it in set_main_option().

Note: The unit tests call create_engine() directly via conftest.py to force in-memory SQLite database use, which is fine.
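A minimal sketch of such a helper, based on the description above. The function name and the default URL come from the text; using os.environ.setdefault() is an assumption about one simple way to write it, not necessarily how Squatter does it.

```python
import os

DEFAULT_DATABASE_URL = "sqlite:///templates.db"


def get_database_url() -> str:
    """Return DATABASE_URL, setting it to the local SQLite file if unset."""
    # setdefault() stores the default in the environment when the variable
    # is missing, and returns whatever value ends up being set.
    return os.environ.setdefault("DATABASE_URL", DEFAULT_DATABASE_URL)
```

In env.py, the call would then look like config.set_main_option("sqlalchemy.url", get_database_url()), with no need for the if check.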

I added the alembic directory, and alembic.ini to version control, so changes are tracked. I made sure that commits were done along the way.

Let Alembic manage schema

From both the CLI and UI entry points, I removed the call to init_schema(), which is where the schema was created with “Base.metadata.create_all(bind=get_engine())”. From now on, the app will assume the schema is there, and Alembic is used to keep the schema up-to-date. You can use “alembic upgrade head” to update the schema before running the app.

Create A Baseline Migration

Since the database already exists, we need to make sure Alembic treats the current schema as the baseline. Use the following commands to build the files needed, stamp the existing database (which creates the alembic_version table), and show the current version.

uv run alembic revision -m "initial schema"
uv run alembic stamp head
uv run alembic current

This should show the version number of head. You can also run sqlite3 on the database and check that there is an alembic_version table containing the same version. You can run the app to verify that the database is being accessed correctly.
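The same check can be scripted. Alembic keeps the revision in a version_num column of the alembic_version table, so a few lines of sqlite3 are enough. The sketch below runs against an in-memory database with a made-up revision id ("abc123"), rather than the real templates.db, and the helper name current_revision is hypothetical.

```python
import sqlite3


def current_revision(conn):
    """Return the stamped Alembic revision, or None if there is none."""
    has_table = conn.execute(
        "SELECT 1 FROM sqlite_master "
        "WHERE type = 'table' AND name = 'alembic_version'"
    ).fetchone()
    if has_table is None:
        return None
    row = conn.execute("SELECT version_num FROM alembic_version").fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
before_stamp = current_revision(conn)  # no table yet

# Simulate what "alembic stamp head" leaves behind (made-up revision id).
conn.execute("CREATE TABLE alembic_version (version_num VARCHAR(32) NOT NULL)")
conn.execute("INSERT INTO alembic_version (version_num) VALUES ('abc123')")
after_stamp = current_revision(conn)
conn.close()
```

Pointing sqlite3.connect() at the real database file instead would return the same revision that "alembic current" prints.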

Now, we can do our first migration. I will change an enum, which is referenced in the database as a VARCHAR, currently 11 characters in length (although SQLite allows any length), to have a (larger) fixed length. I changed the model from this:

    condition: Mapped[FilterCondition] = mapped_column(
        Enum(FilterCondition, native_enum=False)
    )

to…

    condition: Mapped[FilterCondition] = mapped_column(
        Enum(FilterCondition, native_enum=False, length=32)
    )

Now, a new migration can be created with the Alembic command:

uv run alembic revision --autogenerate -m "expand filter condition length"
...
INFO [alembic.autogenerate.compare.types] Detected type change from VARCHAR(length=11) to Enum('STARTS_WITH', 'EQUAL', 'NOT_EQUAL', 'INT_LESS_THAN', 'INT_GREATER_THAN', name='filtercondition', native_enum=False, length=32) on 'report_filter.condition'

This created a new file in alembic/versions/. You should check the code in the file to make sure it looks correct (and make any changes, if needed, for the migration). In my case, it created upgrade and downgrade functions:

def upgrade() -> None:
    """Upgrade schema."""
    # ### commands auto generated by Alembic - please adjust! ###
    with op.batch_alter_table('report_filter', schema=None) as batch_op:
        batch_op.alter_column('condition',
               existing_type=sa.VARCHAR(length=11),
               type_=sa.Enum('STARTS_WITH', 'EQUAL', 'NOT_EQUAL', 'INT_LESS_THAN', 'INT_GREATER_THAN', name='filtercondition', native_enum=False, length=32),
               existing_nullable=False)

    # ### end Alembic commands ###


def downgrade() -> None:
    """Downgrade schema."""
    # ### commands auto generated by Alembic - please adjust! ###
    with op.batch_alter_table('report_filter', schema=None) as batch_op:
        batch_op.alter_column('condition',
               existing_type=sa.Enum('STARTS_WITH', 'EQUAL', 'NOT_EQUAL', 'INT_LESS_THAN', 'INT_GREATER_THAN', name='filtercondition', native_enum=False, length=32),
               type_=sa.VARCHAR(length=11),
               existing_nullable=False)

    # ### end Alembic commands ###

Note: It used batch_alter_table(), because I set render_as_batch=True for SQLite.

Next, you can (optionally) check to see what Alembic will do, and then run the migration. The changes can be checked in the database, after the migration and you can check that the new version is stored in the alembic_version table:

uv run alembic upgrade head --sql
uv run alembic upgrade head

sqlite3 templates.db
.schema report_filter

...
condition VARCHAR(32) NOT NULL,

SELECT * FROM alembic_version;
189b1b7b43d8

.quit

Now, you can run unit tests to ensure they pass, and do live tests to make sure that the filters are there, the new enum values are present, and no data was lost.

Future Migration Process

  • Invoke: uv run alembic current
  • Modify the model(s)
  • Invoke: uv run alembic revision --autogenerate -m "describe change"
  • Review the created migration file and adapt, as needed.
  • Invoke: uv run alembic upgrade head
  • Invoke: uv run alembic current
  • Run all tests, and perform live testing of the app.
  • Commit the model changes, and the new migration file.

 

 

 

 

 

TODO:

  • Add Alembic support so database changes are easier, and can migrate to PostgreSQL later.
  • HTML rendering for web.
  • Port to Kubernetes container, use a PostgreSQL container for database, and make accessible via domain name.
  • User login, and Group/Club association.
  • Text report that lists staff per squad (optionally with filter names, e.g. RO, Scorer, NROI).
  • Place code on GitHub and link to latest version.

Category: Linux, S/W Development | Comments Off on Don’t be NiceGUI
December 11

Updating Kubernetes nodes’ OS

With nine nodes in my cluster right now, each running Ubuntu 24.04, I want to ensure that the latest updates are present on the nodes.

I know I can remove the node from the cluster, update the OS, and then re-add the node, but I’m hoping there is an easier way.

I asked ChatGPT, and the two best methods suggested were to create a custom Ansible playbook to do the updates, or to use the Kubernetes Cluster API. The Cluster API would take a lot of effort to setup, so I’m opting for the playbook approach.

The steps suggested are:

  • cordon the node

  • drain the node

  • apply apt updates

  • reboot

  • wait for node to be ready

  • uncordon

ChatGPT provided an example playbook with these steps. For my cluster, however, which uses Longhorn storage, I want to change the node drain policy before the updates are done, so that the drain command doesn’t time out waiting for any sole replica. After the upgrade, the drain policy can be restored.

The revised playbook (rolling_apt_upgrade.yaml) looks like this:

---
- hosts: kube_node
  serial: 1
  become: yes

  pre_tasks:
    - name: "Set Longhorn node-drain-policy BEFORE rolling updates"
      command: >
        kubectl -n longhorn-system patch setting node-drain-policy
        --type=merge -p '{"value":"block-for-eviction-if-contains-last-replica"}'
      delegate_to: "{{ groups['kube_control_plane'][0] }}"
      run_once: true

  tasks:
    - name: Cordon the node
      command: kubectl cordon {{ inventory_hostname }}
      delegate_to: "{{ groups['kube_control_plane'][0] }}"

    - name: Drain the node
      command: >
        kubectl drain {{ inventory_hostname }}
        --ignore-daemonsets
        --delete-emptydir-data
        --grace-period=30
      delegate_to: "{{ groups['kube_control_plane'][0] }}"

    - name: Apply apt upgrades
      apt:
        upgrade: dist
        update_cache: yes

    - name: Reboot the node
      reboot:

    - name: Wait for node to return to Ready
      command: kubectl get node {{ inventory_hostname }} -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
      register: node_ready
      retries: 40
      delay: 10
      until: node_ready.stdout == "True"
      delegate_to: "{{ groups['kube_control_plane'][0] }}"

    - name: Uncordon the node
      command: kubectl uncordon {{ inventory_hostname }}
      delegate_to: "{{ groups['kube_control_plane'][0] }}"

  post_tasks:
    - name: "Restore Longhorn node-drain-policy AFTER rolling updates"
      command: >
        kubectl -n longhorn-system patch setting node-drain-policy
        --type=merge -p '{"value":"block-if-contains-last-replica"}'
      delegate_to: "{{ groups['kube_control_plane'][0] }}"
      run_once: true
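The retries/delay/until settings on the wait task amount to a polling loop, giving the node several minutes (on the order of 40 x 10 seconds) to come back. As a rough Python sketch of that behavior (the function name wait_until and its parameters are made up for illustration, and the exact number of attempts Ansible makes may differ slightly):

```python
import time


def wait_until(check, retries=40, delay=10.0, sleep=time.sleep):
    """Call check() until it returns True, pausing between attempts.

    Returns the number of attempts used, or raises TimeoutError.
    """
    for attempt in range(1, retries + 1):
        if check():
            return attempt
        if attempt < retries:
            sleep(delay)  # wait before polling again
    raise TimeoutError(f"condition not met after {retries} attempts")


# Simulated node that reports Ready on the third poll. A fake sleep
# function records the delays, so the example runs instantly.
statuses = iter(["False", "False", "True"])
pauses = []
attempts = wait_until(
    lambda: next(statuses) == "True",
    retries=5,
    delay=10,
    sleep=pauses.append,
)
```

In the simulated run, the check succeeds on the third attempt after two 10-second pauses.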

From my ~/workspace/picluster area, with the playbook in the sub-dir playbooks, I invoked with:

ansible-playbook -i inventory/mycluster/hosts.yaml playbooks/rolling_apt_upgrade.yaml

I was having issues on one node, where it was not becoming ready. What I saw was that the node did not know the IP of the API (lb-apiserver.kubernetes.local), and to resolve I had to add an entry to /etc/hosts mapping the IP to that name. I guess the problem was that, on reboot, kubelet is not up, so it cannot get the DNS info for the API. I don’t have a separate DNS server.

I added an Ansible playbook to do this in playbooks/update_host_tmpl.yaml, and it can be run with --limit to specify the node, if desired. I am adding this to the node prep steps in Part IV of my series on Raspberry PI clusters.

 

 

 

Category: bare-metal, Kubernetes, Linux, Raspberry PI | Comments Off on Updating Kubernetes nodes’ OS
February 19

Lazyjack – Provisioning bare-metal for IPv6 Kubernetes

v1.4

I’ve been experimenting with IPv6, Kubernetes, and Istio using Docker-In-Docker. One difficulty I’ve been having is accessing the cluster externally, as the whole cluster is running in docker containers on one VM.

I decided to try to get Kubernetes running on multiple bare-metal nodes. Well, this turned out to be quite challenging, as there are many configuration settings and tweaks needed to make this work.

Not wanting to endure that agony each time I set things up, or spend hours with others who want to do the same thing, I decided to write a small Go app to automate this setup. Lazyjack is the culmination of that effort.

You can find details on how to set up and use Lazyjack in the GitHub repo, but I’ll run through the steps here, using a two-system setup I have in a lab.

 

Step 1: Get Everything Needed

Hardware: I already had two Ubuntu 16.04 systems, each with a pair of interfaces, one for SSH access to the box for provisioning, and one connected to an L2 switch, which would be used for the “management” network for Kubernetes. This second interface was new, and didn’t have any configuration on it.

Both boxes have access to the Internet (V4, using NAT in the lab), so that I can access repos and pull down stuff.

Update: If you want to be able to access remote IPv6 sites, without doing NAT64 (and using their IPv4 address), enable IPv6 and forwarding on each node, with an IPv6 address on the main interface. If using SLAAC, ensure accept_ra=2 for the main interface, using sysctl.

Software: Being development systems, docker 17.03.2-ce and Go 1.9.2 were installed. I think these systems already had openssl installed. Likewise, Kubernetes was installed (sudo apt-get install kubernetes kubelet kubeadm) on these systems.

Update: You should install CNI v0.7.1+ on the systems, otherwise, there may be issues with IPv6 support (e.g. ip6tables configuration).

Lazyjack: The easiest way is to download the latest release, untar, and place the executable in your system path on each system.  For example, for the first release:

mkdir ~/bare-metal
cd ~/bare-metal
wget https://github.com/pmichali/lazyjack/releases/download/v1.0.0/lazyjack_1.0.0_linux_amd64.tar.gz
tar -xzf lazyjack_1.0.0_linux_amd64.tar.gz
sudo cp lazyjack /usr/local/bin

 

Note: The tar file name may be different, based on the version of lazyjack you use.
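Since the tar file name changes with the version, it can help to build the URL from the version number. Here is a minimal Python sketch, assuming other releases follow the same naming pattern as the v1.0.0 asset above (that is only a guess, so check the releases page). The helper name release_url is made up for illustration:

```python
# Hypothetical helper: builds a download URL by ASSUMING every release
# follows the naming pattern of the v1.0.0 asset shown above.
BASE = "https://github.com/pmichali/lazyjack/releases/download"


def release_url(version, os_name="linux", arch="amd64"):
    """Return the assumed tarball URL for a lazyjack release, e.g. '1.0.0'."""
    version = version.lstrip("v")  # accept both '1.0.0' and 'v1.0.0'
    return f"{BASE}/v{version}/lazyjack_{version}_{os_name}_{arch}.tar.gz"


print(release_url("1.0.0"))
```

For 1.0.0, this yields the same URL used in the wget command above.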

Alternately, you can get the repo:

go get github.com/pmichali/lazyjack

build it:

cd ~/go/src/github.com/pmichali/lazyjack
go build cmd/lazyjack.go

 

And then move the executable to your system path on each system. The sample-config.yaml can be used as a template for the configuration.

 

Step 2: Create a Configuration File

I’m lazy, so on the system I was going to use as the master node, I just took sample-config.yaml and renamed it config.yaml. That file has the following network definitions already set up:

Management network –  fd00:20::/64

Support network – fd00:10::/64

Pod network – fd00:40:0:0:X/80

Service network – fd00:30::/110

DNS64 network –  fd00:64:ff9b::/96

The only thing I needed to do was identify the hostnames I was using, and the interface name for the interface that would be used for the management network. The definitions I used were:

topology:
    bxb-c2-77:
        interface: "enp10s0"
        opmodes: "master dns64 nat64"
        id: 2
    bxb-c2-79:
        interface: "enp10s0"
        opmodes: "minion"
        id: 3
support_net:

 

As you can see, bxb-c2-77 will be the master node, and it will have dns64 and nat64 containers running on it, to support IPv6 on the cluster. The sole minion is bxb-c2-79, but you can list more nodes here. Likewise, you can use a separate node for the dns64 and nat64 services.

Each node has a unique (and arbitrary) ID, from 2 to 65535 (but why use huge numbers?).
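To make the role of the ID concrete: in the sample pod listing in Step 5, the master (ID 2) has the management address fd00:20::2, and its DNS pod has an address inside fd00:40:0:0:2::/80. Here is a minimal Python sketch that reproduces that pattern with the standard ipaddress module. The mapping is inferred from that output, not taken from the lazyjack source, and the function name and the choice to write the ID as a hex group are assumptions:

```python
import ipaddress

# Prefixes taken from the sample config above.
MGMT_NET = ipaddress.ip_network("fd00:20::/64")
POD_PREFIX = "fd00:40:0:0"  # the X in fd00:40:0:0:X/80 is filled in per node


def node_addresses(node_id):
    """Return (management address, pod subnet) for a node ID (assumed mapping)."""
    if not 2 <= node_id <= 65535:
        raise ValueError("node ID must be in the range 2-65535")
    mgmt_ip = MGMT_NET[node_id]  # e.g. ID 2 -> fd00:20::2
    # Assumption: the ID becomes one hex group of the pod prefix.
    pod_net = ipaddress.ip_network(f"{POD_PREFIX}:{node_id:x}::/80")
    return mgmt_ip, pod_net


for name, node_id in [("bxb-c2-77", 2), ("bxb-c2-79", 3)]:
    mgmt_ip, pod_net = node_addresses(node_id)
    print(name, mgmt_ip, pod_net)
```

This is only a way to sanity-check addresses you see on the cluster against the IDs in config.yaml.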

Update: You can configure DNS64 to allow use of IPv6 addresses, so that we can directly access external sites that support IPv6:

dns64:
    allow_ipv6_use: true

 

With that, we are ready to get things rolling…

 

Step 3: Initialize For Kubernetes

On the master (bxb-c2-77 in my case), run lazyjack (I’m assuming it is in your path) with the init command (from the area where the config.yaml file is, so that you don’t have to specify the location):

sudo lazyjack init

 

Yes, you need to run all lazyjack commands as root, because privileged access is needed to various resources. If you don’t run as root, you’ll see a permission denied error.

If you are curious as to what it does, you can add the “-v 4” option, before the “init” argument.

This command will create the certificates and keys needed for Kubernetes, and will place information into the configuration file (config.yaml), with a .bak file preserving the previous version (multiple runs of this command will overwrite that backup, BTW). Also, the file will be owned by root, but its permissions are changed to 0777, so that you can edit the file later, if needed.

You must copy the configuration file to all other nodes, now that it has the updated information.

 

Step 4: Prepare the Systems

Running lazyjack with the “prepare” command will get a system ready for running Kubernetes. Run this command on each node.

Note: this command will generate a kubeadm.conf file in the work area (default /tmp/lazyjack) of the master node. If desired, you can customize this file to specify different settings for the cluster. For example, you can change the kubernetesVersion line, to pick a different version than the 1.9.0 that was generated.

 

Step 5: Cluster Bring-up – Master First

On the master, run lazyjack with the “up” command. This will take a few minutes, as it starts up kubeadm. Once completed, you can set up kubectl by doing:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

 

On subsequent runs, I usually do a “rm -rf ~/.kube”, prior to these commands.

Now, you can run “kubectl get nodes -o wide” to see that this node is up, and “kubectl get pods --all-namespaces -o wide”, to see when Kubernetes is fully up. You’ll see something like this:

NAMESPACE   NAME                              READY  STATUS   RESTARTS AGE IP                NODE
kube-system etcd-bxb-c2-77                    1/1    Running  0        2m  fd00:20::2        bxb-c2-77
kube-system kube-apiserver-bxb-c2-77          1/1    Running  0        2m  fd00:20::2        bxb-c2-77
kube-system kube-controller-manager-bxb-c2-77 1/1    Running  0        2m  fd00:20::2        bxb-c2-77
kube-system kube-dns-dcf744547-k56t2          3/3    Running  0        3m  fd00:40::2:0:0:29 bxb-c2-77
kube-system kube-proxy-m9z9m                  1/1    Running  0        3m  fd00:20::2        bxb-c2-77
kube-system kube-scheduler-bxb-c2-77          1/1    Running  0        2m  fd00:20::2        bxb-c2-77
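If you would rather not eyeball that listing, the check can be scripted. Here is a minimal Python sketch that takes the text of the listing and reports any pod that is not Running with all of its containers ready. The function name is made up, and the sample below is a shortened copy of the output above with the kube-dns READY value changed to 2/3 on purpose, to show a pod that is still coming up:

```python
def pods_not_ready(listing):
    """Return names of pods that are not Running with all containers ready."""
    lines = [ln for ln in listing.strip().splitlines() if ln.strip()]
    header = lines[0].split()
    name_col = header.index("NAME")
    ready_col = header.index("READY")
    status_col = header.index("STATUS")
    pending = []
    for line in lines[1:]:
        fields = line.split()
        ready, total = fields[ready_col].split("/")  # e.g. "3/3"
        if fields[status_col] != "Running" or ready != total:
            pending.append(fields[name_col])
    return pending


# Shortened sample; the 2/3 value is altered for illustration.
SAMPLE = """
NAMESPACE   NAME                      READY STATUS  RESTARTS AGE IP                NODE
kube-system etcd-bxb-c2-77            1/1   Running 0        2m  fd00:20::2        bxb-c2-77
kube-system kube-dns-dcf744547-k56t2  2/3   Running 0        3m  fd00:40::2:0:0:29 bxb-c2-77
"""

print(pods_not_ready(SAMPLE))  # -> ['kube-dns-dcf744547-k56t2']
```

An empty list means everything in the listing is up. The sketch only parses text, so you would feed it the output of the kubectl command yourself.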

 

You can untaint the master, if you want to be able to create pods on that node.

 

Step 6: Cluster Bring-up – Minions

After you are sure that the master is completely up (all pods and services running), go onto each of the minion nodes, and run the same “up” command. The command should complete quickly, and you can check the status of the node, using the “kubectl get nodes” command on the master. It does take a bit for the minions to become ready. Likewise, you can use the “kubectl get pod” output to see that a proxy is running for each minion.

Note: The reason we don’t do all of the steps on one node is that lazyjack will set up static routes to other nodes, and the interfaces must be set up on those systems first.

 

Step 7: Enjoy!

That’s it. You can now play with Kubernetes, creating pods that will have IPv6 addresses, and that should be able to ping6 pods on other nodes and have external access to the Internet.

 

Step 8: Cleanup

You can run the “down” and then “clean” commands on each minion, and then on the master, to clean things up.
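The ordering across Steps 3 through 8 matters (master before minions on the way up, minions before master on the way down), so here is a minimal Python sketch that lists who runs what, and when, for the topology from Step 2. It only builds a plan and runs nothing. The function names are made up, and the “copy config.yaml” entries are reminders, not real commands:

```python
# Topology from Step 2, reduced to hostname -> opmodes.
TOPOLOGY = {
    "bxb-c2-77": "master dns64 nat64",
    "bxb-c2-79": "minion",
}


def split_nodes(topology):
    """Return (master, other nodes), with the master found by its opmode."""
    masters = [h for h, modes in topology.items() if "master" in modes.split()]
    others = [h for h in topology if h not in masters]
    return masters[0], others


def bring_up_plan(topology):
    """Steps 3-6: init and copy the config, prepare every node, then up."""
    master, others = split_nodes(topology)
    nodes = [master] + others  # master always goes first
    plan = [(master, "sudo lazyjack init")]
    plan += [(master, f"copy config.yaml to {node}") for node in others]
    plan += [(node, "sudo lazyjack prepare") for node in nodes]
    plan += [(node, "sudo lazyjack up") for node in nodes]
    return plan


def teardown_plan(topology):
    """Step 8: down and clean on each minion first, the master last."""
    master, others = split_nodes(topology)
    plan = []
    for node in others + [master]:
        plan += [(node, "sudo lazyjack down"), (node, "sudo lazyjack clean")]
    return plan


for host, action in bring_up_plan(TOPOLOGY) + teardown_plan(TOPOLOGY):
    print(f"{host}: {action}")
```

Remember that, in practice, you should wait for the master to be completely up before running “up” on the minions; the plan only captures the order.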

 

Troubleshooting

Problems Bringing Up a Minion

If the “up” command on a minion fails, you can retry it with “-v 4” to see verbose output. Then, you can manually perform some of the steps that are shown. In one case, I had kubeadm join failing and when running manually, I saw:

c2@bxb-c2-78:~/bare-metal$ sudo kubeadm join --token ...
[preflight] Running pre-flight checks.
 [WARNING FileExisting-crictl]: crictl not found in system path
[preflight] Some fatal errors occurred:
 [ERROR Port-10250]: Port 10250 is in use

 

This occurs when the kubelet service is already running and using that port. You can stop the service and then do the “lazyjack up” command, or just run the “down” and then “up” commands, which should reload the daemon and restart the service.
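Before retrying, you can confirm whether something is still holding the port. Here is a minimal Python sketch that tries a TCP connection to the port from the error above. The function name is made up, and it assumes the service is reachable over IPv4 loopback; on an IPv6-only listener you would need AF_INET6 and “::1” instead:

```python
import socket

KUBELET_PORT = 10250  # port named in the kubeadm preflight error above


def port_in_use(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success, an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    state = "in use" if port_in_use(KUBELET_PORT) else "free"
    print(f"port {KUBELET_PORT} is {state}")
```

If it reports the port as in use after you have stopped the kubelet service, something else is listening there.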

 

 

Category: bare-metal, Go, Istio, Kubernetes, Linux | Comments Off on Lazyjack – Provisioning bare-metal for IPv6 Kubernetes