Recently, Facebook AI Research, in collaboration with the University of Rennes, released the paper “And the Bit Goes Down: Revisiting the Quantization of Neural Networks”, which was accepted to ICLR 2020. The authors propose a weight quantization method for ResNet-like architectures based on Product Quantization. Unlike many other approaches, the error introduced by the codewords is not minimized directly on the weights. Instead, the method minimizes the reconstruction error of fully-connected and convolutional layer activations using weighted k-means. Quantization is applied to all layers with 3x3 and 1x1 kernels except the first convolutional layer. The paper emphasizes the importance of optimizing on in-domain input data in both the quantization and fine-tuning stages. Using this technique, the weights of ResNet-50 can be compressed by a factor of 20x while maintaining competitive top-1 accuracy (76.1%). The potential impact of byte-aligned codebooks on efficient CPU inference is briefly mentioned, but no concrete method is presented. We propose and explore one possible way of exploiting frequently repeated codewords across input channels in order to accelerate inference on mobile devices.
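To make the "weighted k-means" idea concrete, here is a minimal sketch of the clustering step. It is a simplification, not the paper's exact algorithm: the paper weights the reconstruction error by in-domain input activations, while here each subvector simply carries a scalar importance weight (the function name and interface are our own, for illustration):

```python
import numpy as np

def weighted_kmeans(vectors, weights, k, iters=10, seed=0):
    """Cluster `vectors` into k codewords, minimizing the
    importance-weighted squared error sum_i w_i * ||v_i - c(v_i)||^2."""
    rng = np.random.default_rng(seed)
    # Initialize centroids from random input subvectors.
    centroids = vectors[rng.choice(len(vectors), size=k, replace=False)]
    assign = np.zeros(len(vectors), dtype=int)
    for _ in range(iters):
        # E-step: assign each subvector to its nearest codeword.
        dists = ((vectors[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(axis=1)
        # M-step: each codeword becomes the weighted mean of its cluster.
        for j in range(k):
            mask = assign == j
            if mask.any():
                w = weights[mask][:, None]
                centroids[j] = (w * vectors[mask]).sum(axis=0) / w.sum()
    return centroids, assign
```

In a product-quantization setting, the rows of a weight matrix would be split into fixed-size subvectors, clustered as above, and each subvector replaced by the index of its codeword; the weights would come from activation statistics rather than being arbitrary scalars.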
In September 2018, the Google Research team released a paper titled “No Multiplication? No floating point? No problem? Training networks for efficient inference”, which we will refer to as NMNF. Convolutional layers are the main building blocks of convolutional neural networks, and the great majority of inference time is spent in them. The NMNF paper targets devices like hearing aids, earbuds, or wearables. Such devices are highly resource-constrained in terms of memory, power, and computation, and therefore benefit from the specialized implementation of the convolutional layer introduced in the paper. Inference-time floating point operations are not only energy-hungry compared to integer operations but also computationally demanding. The NMNF approach avoids floating point operations entirely and, as a consequence of the underlying quantization, we can enjoy a reduced model size as well.
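The core trick behind multiplication-free inference of this kind is that when both weights and activations are restricted to small codebooks, all possible products can be precomputed into a table, so a dot product becomes table lookups plus additions. The sketch below illustrates the idea; the codebook sizes and values are our own illustrative choices, not the paper's:

```python
import numpy as np

# Hypothetical codebooks (sizes and values are illustrative only).
W_CODEBOOK = np.linspace(-1.0, 1.0, 8)    # allowed weight values
A_CODEBOOK = np.linspace(0.0, 2.0, 16)    # allowed activation values

# Precomputed product table: PRODUCTS[i, j] = W_CODEBOOK[i] * A_CODEBOOK[j].
# At inference time, lookups into this table replace multiplications.
PRODUCTS = np.outer(W_CODEBOOK, A_CODEBOOK)

def dot_via_lookup(w_idx, a_idx):
    """Dot product of a weight vector and an activation vector, both
    stored as codebook indices: only table lookups and additions."""
    return PRODUCTS[w_idx, a_idx].sum()
```

In a real deployment the product table itself could be stored in fixed point, so that the accumulation runs entirely in integer arithmetic; here we keep floats for readability.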