For the second time and with great pleasure, Streamdata.io was at Devoxx France with a booth and also several talks.
Here is a compilation of what the team has attended and liked.
Day 2 – Conferences
As usual, the conferences day started with keynotes: the theme of 2016 was Society and Software Development.
The keynotes series started with the Devoxx France Team. What is Devoxx France? It’s more and more cfp submissions: 788 subjects in 2016. 230 elected. It’s also more and more attendees: around 1250 in 2012, 1400 in 2013, 1497 in 2014, 2509 in 2015 and 2762 in 2016. What a success! It’s food: 10800 bottles of water, 3600 litres of water, 7000 sponge cakes, 7600 meals, 2600 sodas. It’s also a Café Philoxx to discuss philosophy, a HackerGarten to hack and contribute to open-source projects, a Meet and Greet to exchange, a Seed Networking and a startups village to promote French startups. It’s also a Devoxx4Kids to inspire children to programming, robotics and engineering (it was the day after Devoxx which is organized by the Devoxx4Kids team).
e-health by Jean-Michel Billaut
That was an interesting talk. According to Jean-Michel, from health perspective, humanity has known two great revolutions. The first one is the agricultural revolution: the humans had no more to chase to get food and survive. The second one is cheap food: humanity can get healthy cheap food (at least in our European countries?). The third revolution is coming: it’s the e-health which promises to be in health forever. No more diseases. One of disruptive innovation of the e-health is genomic scissors invented by Emmanuelle Charpentier and Jennifer Doudna. These scissors enables to cut and thus modify DNA with extreme precision: that’s “medicine of precision”.
While other countries invest money (and a lot) such as USA, UK or China with iCarbonX (which aims to sequence all the genomes: not only human genomes, but also animals ones, microbes ones, …), France is reluctant to invest in this next revolution and doesn’t seem to grasp the challenges (both in terms of health and economic innovations). In Great Britain, for instance, there is one initiative: Genomic England. One big cloud to trace and try to cure genetic diseases. In France, 80 data centers in silos. This does not really ease cooperation and efficiency… Still in UK, Finland and USA, you can find videos that explain genetic stuff and promote the genomic research while nothing in France according to Jean-Michel. In USA, 10 billions of dollars have been invested in 1500 e-health startups. A Boston startup offers to sequence your genome for less than 1000 $ with a predictive analysis. In France, it’s strictly forbidden (jail + penalties). From a paradigm perspective, we are switching from a curative medicine to a predictive medicine. And predictive medicine is a new way that can be made possible thanks to big data. Sequencing thousands and thousands genomes and analyzing them can lead to infer new treatments against cancers, tumors, etc. thanks to big data and machine learning.
IoT has a role to play too. With portable spectrometers for instance, people will be able to analyze what they are eating (how many pesticides they eat, etc.) and make some choices for their health. “The human being is math: it’s a percentage of lipides, glucides, etc.” says Jean-Michel. And to conclude that we won’t need anymore 80% of doctors in the future…
StackOverflow by Joel Spolsky
Joel’s talk started with a joke: what if a taxi were designed by a CSS designer? He emphasized that the developers spend 5% of their time to code. The rest of the time (besides endless meeting?) is spent to know why the code does not work the way they want. And to know why, the best way is to ask other developers why your code doesn’t work. And with the rise of the Internet, it became easier to help and share knowledge. And this led Joel to retrace the history of StackOverflow. Before StackOverflow, there was a site that enables developers to ask and answer other developers’ questions but it was not free: people had to pay. Hence, the birth of StackOverflow. A free platform where developers can share their problems and their solutions. As far as the business model, it is based on the reputations of users, the upvotes and the ability to get jobs offers from companies which pay to post the job offers.
Joel talked also about the power of coder. Software is everywhere in our daily life: phones, e-health, TV favorites, Google Maps, IoT, etc. So many lines of code. And despite these lines of code may have been described 4 pages of specs, the one who made the final interpretation and thus the decision is the coder. In short, 1 LOC = 1 decision. Interesting.
Entrepreneurship in the feminine by Natacha Quester-Séméon
The last keynote was about the place of women in the IT world. Natacha reminded us that without Ada Lovelace, who invented the first algorithm to be run by a machine, we may not have computing science. Without Margaret Hamilton, men would not have landed on Moon and been back to Earth. No doubt women have a big role to play in IT. That’s why she fights to promote gender mix, which should be one of the seed to innovation in the digital world. Hence, initiatives to promote women in the innovation worlds and entrepreneurship such as Girl Power 3.0 or the twitter #JamaisSansElles that made the “buzz” and meet a success in the French IT world.
DDD (Domain Design Driven) by Cyrille Martraire
This was an unexpected talk: at the beginning, I believe a talk about DevOps was foreseen. But the DDD talk was really nice. And Cyrille is a good speaker. First of all, Cyrille gave some criteria to choose whether or not use DDD: if the risk is focused on the domain, if the value of your app is the domain, then DDD is what you need. If your app is a simple CRUD, then no need of DDD.
The fundamental of the DDD is the ubiquitous language. This is a language that must reflect the business domain and be used everywhere (in the code, the conversation, etc.). Everybody (coder, analysts, project manager, product owner, etc.) must talk this language. And as it is a masterpiece of the methodology, you have to pay attention to the language you choose. Things should be easy to be named, no abbreviation, one name to name a thing (no synonyms). In short, be careful to your « thesaurus ». In code, one good piece of advice is to generate a cloud words from your classes: if the cloud reveals only technical words, you fail. At the opposite, if it reveals the business domain words, that’s a good point for you. No null in the code in favor of explicit NullObjects and object calisthenics are also good tools to design / name the domain objects. Why is ubiquitous language important? Because it’s a matter of efficiency. Everybody speaks the same language (no need to translate a « business word » into a technical one) and it’s a way to grasp more easily the business domain.
Core domain is also a center part of DDD. The first quality of a DDD aficionado is the domain curiosity. If you wish to apply DDD, you need to grasp the domain you wish to model. If you don’t want to get knowledge about the domain, you will certainly fail. To grasp the domain, one good way is to start with examples to understand the context. And one good friend for that is BDD (Behavior-Driven Development). Once done that, you need to model your domain. To do so, you need to consider that modeling == code. Code should literally reflect the domain. UML or MDA are in this case not useful.
Then, Cyrille gave some toolbox for DDD. First, he talked about what he called « tactical patterns »: Entities, Value Objects and Services (and Repositories). Then, he switched to the « strategic patterns ». Bounded context is one of them. It enables to split a big model into sub-domains. Linguistic concepts may slightly differ. You may also duplicate some code. But that’s for a good reason: no coupling. Indeed, each sub-domains may evolve by the time and as matter of fact, the code of each sub-domains may also evolve and diverge. This is a long-term consideration to have.
That was a nice and living presentation. If you have the opportunity to watch the video, have no hesitation!
Ansible hors des sentiers battus by Aurélien Maury
Ansible off the beaten path
Interesting talk which started with Ansible basic features and best practices. The speaker presented how to access a server behind a proxy with Ansible. Besides, it is possible to have Ansible running as an agent: commit your configuration and pull it regularly with a cron on every machine where your “agent” is installed. There are a couple useful tips that we are going to use at Streamdata.io. There is a Nagios module to schedule downtime before a production deployment (no more manual downtime!). When running an ansible command, it uses the configuration file (ansible.cfg) in the current directory. You can only override role variables with the help of a command line parameter. Note that you can easily generate test machines by combining Ansible, Packer and Terraform.
Finally, Aurélien insisted on the fact that it is really straight forward to develop you own Ansible module (15 lines of python). Ansible is definitely a great tool.
Progressive WebApps by Cyril Balit and Florian Orpelière
Another nice living talk by a dynamic duo! Progressive WebApps is a trending topic nowadays. To master Progressive WebApps, Cyril and Florian decided to re-develop a small in-house app that welcomes new comers at Sfeir. The code of the app can be found here: https://github.com/Sfeir/peoples. The talk was both an introduction and a feedback of what they have learned.
A Progressive WebApp must comply to several properties, it must be:
- progressive: it should work for every user, regardless the browser choice
- responsive: it should fit any device (desktop, mobile, tablet, etc.)
- connectivity independent: it should work offline or with poor network conditions (« lie-fi » opposed to « online » and « offline »)
- up-to-date: it should alway be up-to-date
- secure: it should provide secure content
- discoverable: it should be identifiable as « applications » (see W3C manifest spec)
- installable: it should allow users to “keep” apps they find most useful on their home screen without the hassle of an app store
- linkable: it should be shared easily via link (a URL) without a complex installation process
- app-like: it should feel like an app with app-style interactions and navigation
- re-engageable: it should make re-engagement easy. Push notifications are good way to do that.
Add to that performance concerns: critical CSS vs load CSS, compression, async/defer use, HTTP/2, good cache management (E-Tags, Last-Modified, etc.), no explicit paint.
Through their application made in material design and angularjs, Cyril and Florian illustrated how they did to build a progressive webapp. Service workers have been used to handle the offline mode. Push notifications have been used for the re-engageable feature of the app while W3C manifest has been settled to get a discoverable app. Nice talk, thanks!
Ce que les stratégies de versioning nous disent des dynamiques d’équipe by Benoît Lafontaine and Hervé Lourdin
What versioning strategies tell you about team synergies
Benoît and Hervé reviewed several versioning strategies, showed how to lessen the hardness of merging code and even branchless patterns.
The most interesting patterns were:
- Feature branch
- One branch for one feature
- Short feature
- Feature is merged with master as soon as it is implemented (and tested)
- Git Flow. This pattern is more complete and should be convenient for most of you
- Master is production
- One development branch
- Several feature branches
- One release branch
- One hotfix branch
- Github Flow. Delete release, hot fix and dev branch from previous pattern
- This is mainly suitable for web development (release are not blocked by other things like Apple validation time for instance).
The second part was dedicated to the less or no merge solutions:
- No branch, no merge
- encapsulate your features into a “if” instruction. To be honest, I would not recommend this solution.
- Mob programming so that the whole team implements one user story together. It does have many advantages (knowledge sharing, code robustness…).
- Feature branch + Continuous merge as implemented by lesfurets.com. You will find more explanations here
They added an important comment which is that you must never build an integration team. In that case, review your versioning strategy instead. Agreed.
As a conclusion, I would say that there is no magic solution. A merge will still be a merge with all implications. Nevertheless, there are several ways to enhance you versioning strategy, I bet you could find one in the choices above.
100% Stateless avec JWT (JSON Web Token) by Hubert Sablonnière
I’ve missed this talk several times in different conferences. Devoxx France was the perfect moment to attend it!
Do you know the lemmings game? At the end of a successful level, you got a code to start the next level. Useful because at that time, you didn’t save your game… Yes, it was a long time ago… In this galaxy though… JWT is almost like that. Once a client logs in, the server sends back a JWT token built with a secret. Every time, the client makes a request to the server, it sends the JWT token too. The server checks the JWT signature. Nothing is stored in the server side. This is a stateless process from the server point of view. Compared to other authentication solutions (like the ones based on cookies or stateful session ids) where you need to use some distributed cache or sticky sessions (or both), JWT is a solution that is easier to deploy. Easy load-balancing, multi-language support, good fit with a « micro-services » architecture, are some assets of JWT. No need to have cache synchronization across servers nodes and all the issues that come along! Hubert’s motto: « Stateless is priceless! ».
A JWT token has three parts:
- a header
- a payload
- a signature
This three parts are separated by a “.” and looks like: “xxxxx.yyyyy.zzzzz”. The first two ones are encoded in base 64.
The header defines the hashing algorithm used (e.g. HS256). The payload contains the claims. Some are already reserved as iss for issuer, exp for expiration time, sub for subject. But you can add your own: public claims for public stuff (e.g. your company name, etc.) and private claims used for a given exchange between the two parties. The signature is then computed according to the hashing algorithm, the encoded header and payload and your secret.
JWT can be used for authentication. But there are other usage: passing data between the parts of a multi-parts form, confirmation email, etc.
Nice talk too. Very didactic! Thanks, Hubert!
Systèmes distribués, scotch, bouts de ficelle et doigts croisés : une histoire du Streaming à Criteo by Serge Danzanvilliers and Yann Schwartz
Distributed systems, scotch tape and fingers crossed: a streaming history at Criteo
Probably among the best speakers this year, the presentation was very interesting and captivating, sometimes funny! Criteo has to manage a huge amount of logs every day, minutes and seconds, and they lived for a long time with a home made solution based on a combination of rsyslog to aggregate front (json) logs, gzip to compress logs (because json cannot be transferred without compression) and curl to post data. This sounds archaic but is actually very scalable. As soon as you need to scale up, you just need to add a rsyslog server instance and the load is round robin balanced between every instances in the pool. Actually, the main problem they faced was the ops team scalability. Even if you can scale this architecture quite easily, when a problem occurs, it is really time consuming to troubleshoot, fix etc…
Hence, they implemented a new scalable solution based on Kafka / Zookeeper with all the features associated: caches, buffers, queues. As they say, it gives “an illusion of streaming”. With regards to data loss, they insisted on the fact that you should better lose data than lose a system. By the way, remember that partial data loss is not critical, tell your boss that you are doing sampling
Débridez les performances de vos applications avec Chronicle Queue by Thierry Abaléa and Riad Maouchi
Unleash the performance of your application with Chronicle Queue
This Tools-in-Action was a short presentation of Chronicle Queue, a java library for high performance process. This is like a standard Java Queue but it targets high-performance. As matter of fact, the absence of flow-control is a feature. Chronicle Queue relies on memory map file to implement the queue. All of this gives nice high performance. The aim is vertical scalability: use as best as possible the resources of a machine and thus be able to reduce the number of servers.
Thierry and Riad have seen a high performance app based on Chronicle Queue that does only one minor GC (Garbage Collection) per day. Very impressive. Through an app that manages concerts tickets, they show us some code and how to use this library.
Performance : Java vous déclare sa flamme by Vincent Beretti and Nicolas Peters
Performance: Flame I’m gonna live forever with Java
Flame graph is a way to represent CPU usage. This has been made famous by Netflix which wrote a couple of nice articles about that (here and there). This Tools-in-Action was a good way to understand what flame graphs are and how to generate them. Flame graphs are graphical representations of the CPU usage. By combining several tools (perf, perf-agent and other scripts), you can generate interactive SVG that shows you the CPU usage for each class used in your app. This is a true representation of what happens because this inspection is done at runtime i.e. after JIT optimizations.
There was a demo that illustrated the use of the different tools to generate a flame graph. Good news, there is one that aggregates all of them for Java: perf-java-flames. You need also a recent JDK (>= 1.8.0_66) and run your program with the jvm argument -XX:PreserveFramePointer. This latter should add a small overhead but negligible. Best is probably to have a look to the slides here to get the tools and command lines to use. We will definitely try it at Streamdata.io!
To be continued… Stay tuned, we will soon publish day 3!
This post has been written in four hands: Jean-François Marquet and Cédric Tran-Xuan.