Clean up untagged docker images

I usually build a lot of docker images on my laptop, and over time the images start eating up my disk space. Here is a quick and simple way to clean up untagged docker images.
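A sketch of the cleanup one-liner (newer Docker versions also offer `docker image prune` for the same job):

```shell
# Remove all untagged ("dangling") images: list the image IDs whose
# repository/tag show as <none>, then pass them to `docker rmi`.
# `xargs -r` skips the removal when there is nothing to remove.
docker images | grep "<none>" | awk '{print $3}' | xargs -r docker rmi
```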

Basically, it just lists all docker images with no tag (<none>) and then removes them. The assumption is that an untagged docker image is unused and safe to remove: when you build an image with the same tag again, the previous image is not removed but untagged. So this is a quick way to clean up my hard drive 😀

Exposing Kubernetes cluster over VPN

Another exercise that I worked on in the last few weeks was to set up and test a kubernetes cluster, and one of the things that bothered me is that I could not access the kubernetes pods and services directly; I had to use kubectl port forwarding, but it is really inconvenient. If you are not familiar with Kubernetes, it is as if kubernetes sets up another cluster inside your cluster. You can read the kubernetes networking document for more detail. So I set a small challenge for myself: make kubernetes accessible over VPN, meaning that once you connect to a VPN gateway you can easily access pods and services.

The deployment can be summarized as below:

There is one easy way: you can set up a VPN server inside the Kubernetes cluster, then expose that VPN via NodePort. But I wanted to go the hard way, not because I like hard things, but because this would help me understand more about kubernetes networking. And it really helped! In summary, there are three steps: connect your VPN node to the kubernetes cluster network, connect your VPN node to kubernetes services, and adjust your VPN configuration accordingly. To give you more context: I am using kubernetes 1.5.2 on CoreOS with the Flannel network addon, and I am using OpenVPN for the VPN server.

Connect VPN node to kubernetes cluster

Kubernetes sets up an overlay network and uses it to manage the pod network. In my case, this is equivalent to connecting my VPN node to the Flannel overlay network, which is quite easy. You need to download the flannel binary here, and identify the correct etcd nodes and runtime parameters. For me, this translates to the following command.
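A sketch of that command; the etcd endpoints, certificate paths, and the VPN node IP below are placeholders, not the actual values from my setup:

```shell
# Join the VPN node to the Flannel overlay network (placeholders throughout).
sudo ./flanneld \
  --etcd-endpoints=https://etcd-1:2379,https://etcd-2:2379 \
  --etcd-cafile=/etc/ssl/etcd/ca.pem \
  --etcd-certfile=/etc/ssl/etcd/client.pem \
  --etcd-keyfile=/etc/ssl/etcd/client-key.pem \
  --etcd-prefix=/cluster.local/network \
  --public-ip=<vpn-node-ip>
```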

Some explanation: I am using multiple etcd servers with custom certs, the prefix is /cluster.local/network, and the public IP is the VPN node's address. After running the command, it takes about 10 seconds, and then I can start pinging the 10.233.0.0/16 subnet, which is the pod subnet of the kubernetes cluster.

Connect VPN node to kubernetes services

After you connect to the flannel network, you still can not access the kubernetes services. Why? Because a service cluster IP is a virtual IP: it is managed by kube-proxy and routed via iptables. You can read additional detail in the Services document. So what I did is download kube-proxy and start it on the VPN node. You will need to make sure your kubeconfig is available so that kube-proxy can connect to the API server.
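Something along these lines; the kubeconfig path is an assumption:

```shell
# Run kube-proxy on the VPN node so it installs the iptables rules that
# make service cluster IPs routable from this host.
sudo ./kube-proxy --kubeconfig=/etc/kubernetes/kubeconfig
```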

After a few seconds, you will be able to access the cluster IPs of services inside the kubernetes cluster.

Adjusting VPN configuration

So now, from the VPN node, we are able to connect to Kubernetes pods and services. I just need to adjust the OpenVPN configuration to declare that the kubernetes subnets are available over the VPN.
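For example, using the pod subnet from this post (your subnets will differ):

```
# server.conf: tell VPN clients to route the kubernetes subnets
# through the tunnel. 10.233.0.0/16 is this cluster's pod subnet;
# add a similar line for the service subnet if it lies outside it.
push "route 10.233.0.0 255.255.0.0"
```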

However, you will soon notice that from a VPN client you still can not access the service IPs. The main reason is that they are not real IPs; they are handled via iptables. So what I did is add an SNAT rule so that the traffic will be handled by iptables on the VPN node, as follows:
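A sketch of that rule; the VPN client subnet (10.8.0.0/24) is an assumption, and `<flannel-ip>` stands for the VPN node's flannel address:

```
# Rewrite the source of packets from VPN clients heading to the cluster
# subnet, so kube-proxy's iptables rules see them as coming from the VPN
# node and replies route back correctly.
iptables -t nat -A POSTROUTING -s 10.8.0.0/24 -d 10.233.0.0/16 \
  -j SNAT --to-source <flannel-ip>
```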

One note: the IP used in the SNAT rule is the VPN node's flannel IP. Additionally, you can push the DNS option so that you can access the kubernetes DNS over the VPN.
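For instance, assuming the cluster DNS service IP is 10.233.0.3 (check yours with `kubectl get svc --namespace=kube-system`):

```
# server.conf: hand the cluster DNS server to VPN clients.
push "dhcp-option DNS 10.233.0.3"
```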


After these three simple steps (it actually took me almost half a day to figure out the whole thing and how it all worked …), I managed to expose the whole cluster over VPN. It is quite convenient since I don't have to do port forwarding or a SOCKS proxy anymore. I believe this is reusable for your cluster as well, maybe with slight differences if you use Weave or another network plugin. I hope you find this helpful, and feel free to share your thoughts in the comment section.


OpenVPN auth-user-pass-verify example

One of my recent tasks was to enable authentication on OpenVPN. auth-user-pass-verify is one way (is it the only way?) to enable authentication in OpenVPN. When a user connects to the VPN, the server writes the username and password to a temporary file, then executes your script with the file path as an argument; the exit code determines whether authentication succeeds. It is a weird protocol, but that is how it works …

To enable authentication, you will need to change your openvpn config as follows:
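Something like the two directives below; the script path is an assumption:

```
# server.conf: call an external script to verify credentials.
# "via-file" makes OpenVPN write username/password to a temp file and
# pass its path to the script; script-security 2 permits running scripts.
script-security 2
auth-user-pass-verify /etc/openvpn/auth.sh via-file
```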

Make sure your script is executable. Below is an example bash script:
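A minimal sketch of such a script, assuming "via-file" mode (first line of the temp file is the username, second line the password):

```shell
#!/bin/bash
# OpenVPN invokes this script with one argument: the path of a temp file
# holding the username (line 1) and password (line 2).
# Exit 0 accepts the client; any non-zero exit rejects it.

check_credentials() {
  local password
  password=$(sed -n '2p' "$1")
  # Toy check from the post: accept only the password "bao".
  # Replace this with a real credential lookup.
  [ "$password" = "bao" ]
}

if [ -n "$1" ]; then
  if check_credentials "$1"; then
    exit 0
  else
    exit 1
  fi
fi
```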

In the example, I simply check if the password is "bao" 😀 . You should replace this with your own authentication logic. Also note that for security reasons openvpn has some constraints on the username and password; check here for more detail.

Tensorflow: exporting model for serving

A few days ago, I wrote about how to retrieve the signature of an exported model from Tensorflow; today I want to continue with how to export a model for serving, particularly exporting a model and serving it with TFServing. TFServing is a high performance tensorflow serving service written in C++. I am working on building a serving infrastructure, so I have to spend a lot of time exporting tensorflow models and making them servable via TFServing.

The requirement for an exported model to be servable by TFServing is quite simple: you need to define named signatures called inputs and outputs. The inputs signature defines the shape of the input tensor of the graph, and the outputs signature defines the output tensor of the prediction.

Exporting from a tensorflow graph
This is straightforward. If you build the graph yourself, you already have the input and output tensors. You just need to create a Saver and an Exporter, then call the export function with the right arguments.
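A sketch using TF 0.11's session_bundle API (since superseded by SavedModel); the little softmax graph is just a stand-in for your own model, and the export path is an assumption:

```python
import tensorflow as tf
from tensorflow.contrib.session_bundle import exporter

# Stand-in graph: x is the input tensor, y the prediction tensor.
x = tf.placeholder(tf.float32, shape=[None, 784], name="x")
w = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x, w) + b, name="y")

saver = tf.train.Saver()
model_exporter = exporter.Exporter(saver)
with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    # The named "inputs"/"outputs" signatures are what TFServing expects.
    model_exporter.init(
        sess.graph.as_graph_def(),
        named_graph_signatures={
            "inputs": exporter.generic_signature({"images": x}),
            "outputs": exporter.generic_signature({"scores": y}),
        })
    model_exporter.export("/tmp/export", tf.constant(1), sess)
```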

Please see here for a complete example.

Exporting from a tf.contrib.learn Estimator
This is actually trickier. Even though the estimator provides an export() API, the documentation is not helpful, and by default it won't export a named signature, so you can not use it directly. Instead, you will need to:

  • Define an input_fn that returns the shape of the input. You can reuse the input_fn from data feeding if you already defined one during training.
  • Define a signature_fn as below
  • Make sure you pass input_feature_key and use_deprecated_input_fn=False when you call the export function.

Below is an example of exporting the classifier from this tutorial. Note: this is only for tensorflow 0.11. For 0.12 and 1.0 the API may be different.
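A sketch along those lines, using the iris DNNClassifier from the tutorial (TF 0.11 contrib.learn API; the export path is an assumption):

```python
import tensorflow as tf
from tensorflow.contrib import layers, learn
from tensorflow.contrib.session_bundle import exporter

def serving_input_fn():
    # For export we only need the features; labels can be None. The single
    # feature uses the empty string as its key (see input_feature_key below).
    features = {"": tf.placeholder(tf.float32, shape=[None, 4])}
    return features, None

def signature_fn(examples, features, predictions):
    # Return (default_signature, named_graph_signatures); TFServing looks
    # for the named "inputs"/"outputs" signatures.
    inputs = exporter.generic_signature({"features": examples})
    outputs = exporter.generic_signature({"scores": predictions})
    return None, {"inputs": inputs, "outputs": outputs}

feature_columns = [layers.real_valued_column("", dimension=4)]
classifier = learn.DNNClassifier(feature_columns=feature_columns,
                                 hidden_units=[10, 20, 10], n_classes=3)
# ... train the classifier as in the tutorial, then:
classifier.export(export_dir="/tmp/iris_export",
                  input_fn=serving_input_fn,
                  input_feature_key="",
                  use_deprecated_input_fn=False,
                  signature_fn=signature_fn)
```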

Some explanation: in the input_fn you define the features of your estimator; it returns a dict of tensors to represent your data. Usually an input_fn returns a tuple of a features tensor and a labels tensor, but for exporting you can skip the labels tensor. You can refer to here for detailed documentation. The above input_fn returns a feature tensor whose feature name is the empty string (""). That's why we also need to pass input_feature_key="" to the export function.

Once the model is exported, you can just ship it to TF Serving and start serving it. I will continue this series in the next few days with how to run the serving service and send requests to it.

Tensorflow: Retrieving serving signatures from an exported model

Simple question, but it took me many hours of digging into the code to figure out :-(. Someone should have added a document or something.

Basically, just import the meta graph, then unpack the protobuf object from the serving_signatures collection. I really don't understand why it is not added to the signature def. Anyway, later you can just call read_serving_signature(path/to/export.meta) to retrieve the exported signatures. It will be very helpful if you want to implement a generic serving interface for tensorflow.
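Roughly, the helper looks like this (a sketch against TF 0.11's session_bundle protos; the function name follows the gist, the collection key is session_bundle's "serving_signatures"):

```python
from tensorflow.contrib.session_bundle import manifest_pb2
from tensorflow.core.protobuf import meta_graph_pb2

def read_serving_signature(meta_path):
    # Parse the exported .meta file and unpack the Signatures proto
    # stored in the "serving_signatures" collection.
    meta_graph_def = meta_graph_pb2.MetaGraphDef()
    with open(meta_path, "rb") as f:
        meta_graph_def.ParseFromString(f.read())
    collection = meta_graph_def.collection_def["serving_signatures"]
    signatures = manifest_pb2.Signatures()
    collection.any_list.value[0].Unpack(signatures)
    return signatures

# signatures = read_serving_signature("path/to/export.meta")
```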

I also made a gist here for reference.