Lessons learned by the developer of the open source Windows application 'SumatraPDF', which has celebrated its 15th anniversary



Krzysztof Kowalczyk, the developer of 'SumatraPDF', which started as a PDF viewer for Windows and has since evolved into a multi-format viewer that can also be used to read e-books and manga, has looked back on the project's 15 years and summarized the lessons he learned from it.

Lessons learned from 15 years of SumatraPDF, an open source Windows app
https://blog.kowalczyk.info/article/2f72237a4230410a888acbfce3dc0864/lessons-learned-from-15-years-of-sumatrapdf-an-open-source-windows-app.html

SumatraPDF is an open source viewer for Windows that can display a variety of file formats, including PDF, DjVu, and XPS, as well as e-book formats such as epub and mobi. The code base is about 127,000 lines of C++. It does not use a GUI abstraction library like Qt and is written directly against the Win32 API. 'This allowed us to make the viewer as compact and fast as possible,' Kowalczyk said.

Kowalczyk appears to have developed SumatraPDF largely together with one other programmer. He has published a graph that roughly shows how much time (vertical axis) went into development over the 15 years from 2007 to 2022, making the point that a great deal of time was spent on it. SumatraPDF has always been a hobby project, and Kowalczyk has a separate full-time job.



Kowalczyk explained why he developed SumatraPDF: 'In 2006, I was working at Palm. One of my jobs at that time was to create a PDF reader for Foleo, a mini notebook PC with ARM and Linux. At the time, I didn't know PDF was a popular data format, but Palm management decided that a PDF reader was essential for Foleo. I was appointed as the sole developer in the project to develop a PDF reader.' The entire Foleo project was canceled shortly before release, and the device never reached the market.

Creating a PDF rendering library requires a huge amount of work, spanning several years. Kowalczyk did not have that kind of time, so he decided to build a PDF viewer on top of the open source library Poppler. The result, he says, was a 'basic PDF viewer that uses Poppler to render a PDF to a bitmap in memory and transfer that bitmap to the screen.' Because PDF is a complex file format, some PDF viewers render very slowly, and Jeff Bezos's remark that 'speed is what customers always care about' stuck in his mind. 'I wanted to speed up the rendering speed and was working on development,' Kowalczyk says.



SumatraPDF grew out of the work on a PDF viewer for Foleo, but the development tools for ARM hardware at that time were apparently in poor shape. 'I decided to compile Poppler for Windows, which had a decent profiler,' Kowalczyk says. Once he had Poppler running on Windows, he wrote the simplest possible GUI app that could display pages and navigate between them. This was released on his website as version 0.1 of SumatraPDF.

The version number was 0.1 because the feature set was so minimal, but Kowalczyk says that releasing it right away let him win early users quickly and learn what features users actually wanted. 'Releasing the product early was more effective than struggling for months and years to implement many features. In the first place, no one cared about the simplicity of the product.'

Kowalczyk also said he was able to profile documents that were particularly slow to render and implement some surprisingly simple and effective optimizations. He wrote that he kept code quality high despite working almost alone, with no code review and no dedicated quality assurance team, by adopting the following methods.

◆ Test the code yourself
Step through newly added code in the debugger to confirm that the new functionality works as expected.

◆ Automatic crash report
While creating automated crash reports is a daunting task, Kowalczyk says it's the most important thing you can do to improve the quality of your software. By checking the report, you can figure out what went wrong with your code change and fix it.

◆assert()
It is also important to use assert(), a technique common in C++ code that adds checks, normally executed only in debug builds, to verify that certain conditions hold. Kowalczyk also enables these checks in pre-release builds that are not debug builds, so that he automatically receives a bug report whenever a user hits a failing condition. 'I can verify much more with 1000 people using the app than I can desperately test by myself,' he says, arguing that it is better to involve many people in debugging.
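As a rough illustration of that idea (not SumatraPDF's actual code), the sketch below contrasts the standard assert(), which is compiled out when NDEBUG is defined, with a check that stays active in every build and passes the failure to a handler. The names `REPORT_IF_FALSE`, `SetReportHandler`, `ReportFailure`, and `ClampPageNumber` are hypothetical, and the handler here only prints to stderr where a real application might upload a report.

```cpp
#include <cassert>
#include <cstdio>
#include <functional>
#include <string>
#include <utility>

// Hypothetical sketch: a handler that receives a description of each failed check.
using ReportHandler = std::function<void(const std::string&)>;

inline ReportHandler& CurrentReportHandler() {
    // Default handler just prints; a real app might send a report instead.
    static ReportHandler handler = [](const std::string& msg) {
        std::fprintf(stderr, "check failed: %s\n", msg.c_str());
    };
    return handler;
}

inline void SetReportHandler(ReportHandler h) {
    CurrentReportHandler() = std::move(h);
}

inline void ReportFailure(const char* cond, const char* file, int line) {
    CurrentReportHandler()(std::string(cond) + " at " + file + ":" + std::to_string(line));
}

// Unlike assert(), this check is not removed by NDEBUG and does not abort.
#define REPORT_IF_FALSE(cond)                                  \
    do {                                                       \
        if (!(cond)) ReportFailure(#cond, __FILE__, __LINE__); \
    } while (0)

// Example use: report an out-of-range page number, then recover by clamping.
inline int ClampPageNumber(int pageNo, int pageCount) {
    assert(pageCount >= 1);  // debug-only check, aborts on failure
    REPORT_IF_FALSE(pageNo >= 1 && pageNo <= pageCount);
    if (pageNo < 1) return 1;
    if (pageNo > pageCount) return pageCount;
    return pageNo;
}
```

Because the reporting check does not abort, the program can recover (here by clamping) while the developer still learns that the condition was violated in the field.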

◆Take a log
When investigating a problem, it helps to know what sequence of events led to the crash. SumatraPDF's logging module now writes to a block of memory, which is sent along with crash reports. Most of the time Kowalczyk does not need the log, but restarting the app just to enable logging is a nuisance, so he wrote a logging application separate from SumatraPDF so that the log is always available. He says the logging application was very easy to implement.
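One common way to log into a fixed block of memory is a ring buffer that keeps only the most recent bytes. The sketch below shows that technique under the assumption of a hypothetical `MemoryLog` class; the article does not describe how SumatraPDF's logging module is actually structured.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical sketch: a fixed-size in-memory log that overwrites its oldest
// bytes once full, so a snapshot can be attached to a crash report.
class MemoryLog {
  public:
    explicit MemoryLog(std::size_t capacity) : buf_(capacity, '\0') {
        assert(capacity > 0);
    }

    // Appends one line followed by a newline.
    void Append(const std::string& line) {
        for (char c : line) Put(c);
        Put('\n');
    }

    // Returns the retained bytes in oldest-to-newest order.
    std::string Snapshot() const {
        if (size_ < buf_.size()) return buf_.substr(0, size_);
        return buf_.substr(head_) + buf_.substr(0, head_);
    }

  private:
    void Put(char c) {
        buf_[head_] = c;
        head_ = (head_ + 1) % buf_.size();
        if (size_ < buf_.size()) ++size_;
    }

    std::string buf_;       // the memory block
    std::size_t head_ = 0;  // next write position
    std::size_t size_ = 0;  // number of valid bytes
};
```

A crash handler could call `Snapshot()` and include the result in its report; the fixed capacity means logging never grows memory use, at the cost of losing older entries.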

◆ Static code analysis
Tools such as Visual Studio's '/analyze' option, Cppcheck, Clang-Tidy, and GitHub's 'CodeQL' are effective for finding errors and warnings to fix.

◆ASAN (Address Sanitizer)
Added in one of the point releases of Visual Studio 2019, ASAN detects errors such as overwriting memory or reading uninitialized memory, according to Kowalczyk, at a very small performance cost. Even with ASAN enabled, 'it's still fast enough to use as a regular build,' Kowalczyk said.

◆Stress test
Because SumatraPDF's main job is rendering complex file formats, crashes often happen when opening one particular file. To make sure there are no such crashes, Kowalczyk wrote stress test code that loads and renders every file in a directory. He runs this code as part of testing before each release.
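The shape of such a stress test can be sketched as follows, assuming C++17 and hypothetical names throughout (`StressTestDirectory`, `RenderFn`, `FakeRender`). The real test would call the actual rendering engine; `FakeRender` is only a stand-in that treats an empty or unreadable file as a failure.

```cpp
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <functional>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical callback: load and render one file, return false on failure.
using RenderFn = std::function<bool(const fs::path&)>;

struct StressResult {
    std::size_t tried = 0;         // number of regular files visited
    std::vector<fs::path> failed;  // files that failed or threw
};

// Walks dir recursively and runs render on every regular file.
inline StressResult StressTestDirectory(const fs::path& dir, const RenderFn& render) {
    assert(fs::is_directory(dir));
    StressResult result;
    for (const auto& entry : fs::recursive_directory_iterator(dir)) {
        if (!entry.is_regular_file()) continue;
        ++result.tried;
        bool ok = false;
        try {
            ok = render(entry.path());
        } catch (...) {
            ok = false;  // a C++ exception counts as a failure
        }
        if (!ok) result.failed.push_back(entry.path());
    }
    return result;
}

// Stand-in renderer for illustration only: succeeds if the file is non-empty.
inline bool FakeRender(const fs::path& file) {
    std::ifstream in(file, std::ios::binary);
    if (!in) return false;
    return in.peek() != std::ifstream::traits_type::eof();
}
```

A hard crash such as an access violation would still terminate the process, so a practical harness would also record the file name before rendering it, so that the offending file can be identified afterwards.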

◆ Unit test
Unit tests are used mainly for low-level functionality such as string formatting, and not very extensively. Still, 'sometimes we find bugs,' Kowalczyk says.

◆ Memory leak
It is surprisingly difficult to find an easy-to-use memory leak detection tool, so Kowalczyk is developing a very simple leak detector of his own. While that tool is in development, he uses Dr. Memory for SumatraPDF. However, 'it works, but it's super slow,' Kowalczyk said.



Kowalczyk also said, 'If there was at least one notable improvement, we updated it as a new release version,' and argues that releases should be frequent, in proportion to the amount of newly written code.

Furthermore, if you want an open source project to succeed, 'It is important to treat the product like commercial software,' Kowalczyk wrote. Specifically, it is important to create and publish a website for the software. SumatraPDF had a website from the first day of release, although some people called it 'a website made by a 6-year-old child'. Kowalczyk draws two lessons from this: 'Ignore nasty people and bastards' and 'Even a crappy website is better than no website.' It also helped to do basic SEO on the website and to run a support forum where users can ask questions and send feature requests.

Simplicity and customizability are also important in software, Kowalczyk said. He says the importance of simplicity was 'learned from the history of Firefox'. Customizability matters too, but having 100 settings in the settings dialog is 'not a good solution,' Kowalczyk said.

SumatraPDF has a small file size, starts instantly, and runs fast, and 'It is still a big advantage,' Kowalczyk wrote. To keep SumatraPDF small, Kowalczyk avoids unnecessary abstractions. Developing software for Windows is particularly troublesome. There are various toolkits such as Qt, WxWindows, and GTK, and they are very easy to use, but Kowalczyk does not recommend using them.

Also, open source is not a good business model, so if you want to make money, you should look for other ways. Kowalczyk used to earn money from advertisements on the SumatraPDF website, but he admitted that by the time he wrote the article, declining Google AdSense revenue meant the ads brought in almost nothing. He also accepts donations through Patreon and PayPal, but while these bring in a little over $100 (about 14,000 yen) a month, it is 'not more than that', which is not enough to live on. In light of this, Kowalczyk wrote, 'It's rare that you have the freedom to do whatever you want with a high salary. Choose what's important to you. Open source offers freedom, but it doesn't provide money.'

in Software, Posted by logu_ii