Chris Mair
Software Development and Training


News from 2021-02-24

Learn PostgreSQL book cover
My colleagues Enrico Pirozzi and Luca Ferrari asked me to review their latest book.
I gladly obliged :-)

"Learn PostgreSQL" is a complete and well-written book. Postgres users of all levels, from beginners to daily practitioners, will find it a very helpful guide.

What I like: No concept is missing. Some introductory books barely scratch the surface of the more advanced concepts, such as SQL window functions, how Postgres implements row-level security, or the inner workings of the write-ahead log.

Not so "Learn PostgreSQL". Everything is there.

Despite the breadth of coverage, the book is not just a rehash of the official documentation. Everything is *explained*, not just listed. The examples are carefully chosen to be as simple as possible, but not too simple.

I'm a long-time Postgres user, yet "Learn PostgreSQL" taught me some interesting things I had overlooked in all these years. I wish I had had this book when I started using Postgres!

What I don't like: The preface states that chapter 15 (Backup and Restore) would introduce external tools (Barman and pgBackRest), but then introduces neither of them. I was a bit disappointed until I found that pgBackRest is actually explained, hidden away in chapter 19 (Useful Tools and Extensions).

What I would like to see: At 636 pages the book is probably on the larger side; still, I think the chapter about server-side programming could use some expansion. Data types are explained well, but the section about PL/pgSQL isn't as comprehensive as other parts of the book.

traffic sign
News from 2020-12-03

Developer's Thursday Meeting (NOI Techpark) on Deep Learning

In this event I showed a few experiments with machine learning.

The example application was classifying traffic signs. The data set was taken from "The German Traffic Sign Benchmark" (GTSRB): it contains 51839 images of 43 different traffic signs from real-world captures (39209 for training and 12630 for the benchmark).

I used GNU Octave to preprocess the images (make the size uniform, stretch the dynamic range, convert to greyscale, and store everything in a simple .mat file).

In the first experiment I used a simple fully-connected network with one hidden layer of 1000 nodes. I implemented the net in GNU Octave using the math from Tariq Rashid's "Make Your Own Neural Network". With this net I achieved a correct detection rate of ~ 85% in the benchmark.

In the next experiment I used dlib, which is a machine learning toolkit for C++. The same network architecture implemented using dlib gave a correct detection rate of ~ 89.5% with dlib's more sophisticated training algorithm.

In the last experiment I kept using dlib, but changed the network architecture to a "deep" network: I started from a "LeNet" architecture and added a few more layers, mixing fully-connected, convolutional and max-pooling layers. To train this network I used a VM with a Nvidia Tesla V100, which dlib can use for CUDA/CuDNN. This way the correct detection rate in the benchmark increased to ~ 95%. Not bad for just a few lines of C++ code and 30 seconds of training.

To put this into perspective: when the first competition was held in 2011, a score of 95% would have placed about halfway in the field of participants, with the top entry reaching 98.98% and beating human performance at 98.81%.

download the source code [zip]

a time series
News from 2020-07-22, updated 2020-08-21

Some (humble) Experiments in Time Series Forecasting

updated presentation [pdf]
updated download of source code, data and images [zip]

(C) 2005-2020 Chris Mair - last changed Feb 24, 2021