lxndryng - a blog by a millennial with a job in IT

Aug 11, 2017

Weeknotes #2: Consistency

I came into this with the intention of being consistent about producing something weekly, but there have been a number of extenuating circumstances around actually doing so: between interviewing for promotion and planning a wedding, it's been rough as far as having the mental capacity to remember to do, well, anything goes.

Maybe I'm just scared of commitment.

Monday: Managing myself

I'm terrible at doing my timesheets: part of this is wilful disobedience, purely because I've been told that I *must* do them without ever being told what benefit my wasting half an hour allocating time to projects could possibly have; the other part is simply me forgetting about the process-oriented parts of my job.

I do do my timesheets often enough to have a favourite time code, though: the incomparable "managing myself." Monday was definitely a "managing myself" day, trying to get through all of the paperwork I've been putting off around changing my name at work and making changes to my pension. Not exciting, but necessary.

Tuesday: Investigation

One of my larger pieces of work at the moment is helping our compliance investigation division to define a flexible cloud environment that meets some fairly stringent legislative requirements around handling data - data that may itself include potentially harmful materials, be they malware or content in the realms of the indecent or illegal.

Only a small challenge, then. Certainly an interesting one.

Were I designing this in a greenfield environment, I would certainly agree with the approach that had been defined by the contractors employed to get this work going: their approach was fairly elegant, with isolation being provided by the use of dedicated VPCs in AWS; a key management service that wasn't reliant on KMS; and a development environment for data scientists that would prove incredibly flexible. We don't live in an ideal world, though: we have a model for the extension of our datacentres into the cloud that means that certain aspects of this design are, by necessity, delegated to another body in the department.

After a day of whiteboarding (I really should buy my own whiteboard markers - I've never been able to pick one up in a room and have it write well) and a lot of explanations - why we designed the cloud networking the way we did, why we chose a vendor-agnostic strategy for any PaaS deployments, why complete control of the network environment couldn't be held by the project, why we mandate the technical controls that we do - we came to a compromise position that we believe will work, with work starting next week to spike it.

These sorts of results are what I really enjoy about my job: it's rare that contractors have any respect for permanent civil servants when it comes to anything technical (and our management doesn't help that - but that's really a subject to be discussed over a few drinks and not somewhere I could be held to account for what I say), but having such a constructive conversation is wonderful.

Wednesday: "How do you know what's up or down?"

Partially on leave, partially trying to sell a customer on a monitoring solution for a large and complex distributed service, Wednesday was a bit of a wash.

The monitoring solution I'm trying to define for this service is premised on a few core tenets:

  1. Don't give a large vendor any money, please
  2. Take a modular approach so we can more easily move with the times as certain components become outmoded
  3. Don't use Oracle Enterprise Manager if you can help it
  4. Rely on gauges, counters and log data rather than runtime instrumentation to get the required data

I concede fairly frequently that I inhabit a position in my job that is somewhat divorced from the reality of delivery: I sit in my ivory tower and declare what, from a corporate point of view, is and isn't acceptable. I welcome input from anyone willing to talk to me, but I have decision-making responsibilities wider than any one project or programme, which means that I don't necessarily have to worry about the first-adopter penalties associated with potentially corporate-wide services or the cost of initial development. The programme is fairly resistant to any approach that isn't 'throw money at the problem, make it go away', but I think we've made some progress inasmuch as they've at least agreed to do a technical spike for the stack (Sensu, InfluxDB, ELK for anyone interested).

This programme, however, is very much the opposite of my Tuesday conversation as far as the contractors go: there's a far more bullish, borderline xenophobic attitude to anything that is considered 'outside' of programme delivery. That certainly is a pattern of behaviour that concerns me: I'm all for camaraderie, but that sort of negative cohesion always worries me.

Thursday: An overdue team meeting

I'm lucky to work in a team of the calibre that I do: everyone is supportive of everyone else and we're generally left to do our own work and follow our own passions where time allows. We're in the process of revising how we make our work product (solution designs, technology catalogues, deployment patterns, best practice) available outside of our team and more widely across the department, and the technology catalogue application I've developed seems to have gone down well with our trial users and has cut down on a lot of low-value communication. I feel like we're at the vanguard with this sort of stuff as far as the wider group's work goes, so that's certainly a good feeling.

Friday: Bringing MIS to the 21st Century

A fairly slow day, but I think I may have designed something to replace an ageing operational monitoring platform with something a lot cheaper and a lot more responsive. Hopefully, this will replace my party piece in interviews of 'I delivered a project ahead of time and below cost in a government department once, who else has done that?' Time will tell.

Media consumption

Music: Russian Red, Chelsea Wolfe, Sarah Fimm

Games: DOTA2, Guilty Gear Xrd Revelator, Battle Bakraid

Thought for the week

Repeating myself gets results eventually: it's recognising cultural differences between Telford and Southend that will speed that up in future. We're a weird organisation.

Jul 21, 2017

Weeknotes #1: An aide for a failing memory

As far as civil servants go, I'm probably in the bottom 10% when it comes to recalling things that I've actually done in my work, whether for my own benefit or when dealing with the arcana of our performance management processes: I'm sick of the bar for my achievements being 'look, [manager's name], I haven't killed anyone and no one's complaining about me, so I'm probably doing OK, right?' The movement around the production of weeknotes by people in government seems as good an excuse as any to start making these sorts of notes. Being the fiercely 'indie web' boy that I am, I have to host mine myself, of course.

Monday: "It's just a database"

Three years ago, I worked in HMRC's Application Architecture space, which meant that I dealt with our SAP estate a fair amount. SAP's products are big, expensive black boxes that handle numbers and spit out the department's accounts, and while I do work for the tax authority, this isn't something that particularly gets my blood up. Moving to Infrastructure and Security was very welcome given my basement-dwelling proclivities towards IT-that-supports-IT: I love orchestration, I love automation and I love the software development lifecycle - colour me a deviant if it seems appropriate.

I still get pulled into discussions about infrastructure for SAP products on the estate as I'm still fairly close to that team, and Monday saw a day-long session of SAP extolling the virtues of their HANA platform, all the while reminding us "it's not just a database, it's a platform." The product roadmaps that SAP have for all of their systems mean that our eventual adoption of HANA is inevitable, but I can't say I'm convinced by anything in their pitch other than "in-memory means faster." Given how long some of the reports take to generate from Business Objects, that will be nothing but a boon to us, but the other benefits that they try to tie to HANA (primarily much-needed UX improvements) seem to be coupled to HANA to push its adoption rather than to meet a real need.

That said, pulling away from a database schema that has its roots in the 1970s is probably a wonderful thing for them.

Tuesday: Monitoring

One of my core responsibilities is strategy for monitoring of live systems across HMRC - something that any organisation could probably do with doing better - but it does feel like I'm having the same conversations over and over again: we've identified toolsets that cover probably 90% of monitoring use cases, and we're working with our delivery partners to identify and address the remaining 10%, but the work feels stalled due to funding concerns. I guess this is just working in government.

The thing that galls me about this is that I'd be really excited about delivering it, if only we could push ourselves to get there. It's important work that could have profound effects on how we work and how well we work.

Wednesday: Risk

We're also looking to do some innovative things in the risk and investigation space, providing HMRC with a platform (not just a database) for analysis of datasets of interest. Of course, this comes with some fairly stringent security requirements and operational concerns given that the organisation within the department that will be undertaking this work sits outside of our IT delivery function.

There had been some consternation about where this environment should sit and whether the services offered by our nascent Cloud Delivery Group would be appropriate for the use case at hand. Our 'corporate' infrastructure services (those operated by the Cloud Delivery Group) are a little less flexible than those one could get with a credit card and your Cloud Provider of Choice, but they come with the benefit of inheritance of the security features we already have on the estate, whether that be the corporate anti-virus/anti-malware service, network security appliances or integration with corporate directory services. Visibility of spend is also a key concern in this space.

It was an incredibly productive conversation, in which (to all appearances, at least) I managed to assuage the concerns that the delivery guys had about using the service, addressing their primary concerns around the agility they'd have in delivery and the level of network isolation they could have while still having access to corporate services. It did also show that we might have some communication issues around how we're publicising the services that we offer from Cloud Delivery Group, so that's also something we can address in the future. The strength of the corporate yoke is something that I'm particularly concerned is being overstated: I for one want our delivery people to be able to do whatever it is that they need to do (within the bounds of reason, of course).

Thursday: Not-so Active Directory

We're big fans of Single Sign-On in HMRC, with the general diktat being "if it can use SSO, it should." In an organisation our size with our access control concerns, it makes sense: try managing 65,000 user accounts across multiple systems without it and you'd definitely be creating a cottage industry. We're trying to get an SSO solution working for our multi-cloud brokering system (which I can't name even though there's an event at which we're speaking in September, I believe...) and it's making visible some interesting issues we have with how we deal with identity - mainly that we can't identify from AD which business unit someone belongs to.

We came to a solution and it shouldn't be too messy, even if we're not automatically identifying which business unit a person belongs to - but we can use our existing solution for role management, so I'll take that as a win.

This is probably my favourite bit of work at the moment, even if it's not what I said I wanted last year.

Friday: Press F5 to Continue

I had a meeting with F5 to discuss how we'll be handling perimeter security for HMRC's new network design (see the YouTube link above for some more detail about this) and how it can be automated. I was asked how much I knew about F5's automation technologies and, without thinking or missing a beat, I answered "it costs money, so I know next to nothing about it." Room erupts; I'm told I "really am the posterboy for the new civil service." Cheers, I think? It was probably the comment combined with my now-standard fly kicks.

I sacrifice foot comfort for no supplier.

It was a genuinely interesting meeting and has given me some ideas for how we can safely delegate perimeter network security to a level that shouldn't impede development, particularly for our digital platform. All very exciting.

Media consumption

Music: Zola Jesus, The Jezabels, Godspeed You! Black Emperor

Books: From Third World to First - Lee Kuan Yew

Games: Mainly Street Fighter V, I'm still terrible at it.

Thought for the week

It's definitely just a fucking database.

May 01, 2017

The Need for Banking APIs

I've misspent a bank holiday weekend trying to make it a little easier to manage my money without having to turn to a plethora of different devices for different pieces of information. The workflow that I have at present involves a mobile phone, a keyfob and between three and five passwords stored in a variety of password managers: clearly, this setup is not something that I particularly ever want to deal with when I just want to quickly check up on my investments or handle the "oh, I've just been paid, I should do my monthly financial tasks" inevitability at the end of each month. I've developed automation for Hargreaves Lansdown, but the security policies of the other organisations I perform financial transactions with don't permit me to take control of my financial affairs in an automated way.

The UK context for opening up banking data

People have been making noise about the lack of APIs in banking, with Payments UK establishing the Open Banking Implementation Entity to develop a set of standards that would be agreeable to banks operating in the UK.

This entity hasn't published meeting notes since October 2016, so who knows what's happening in that space now - given that it was an initiative involving the only people whose IT moves slower than that of government, it's probably going nowhere.

This does seem to be a little bit of a deaf, dumb and blind approach though: projecting massively onto the rest of the population, I don't necessarily need something fancy in this space. The majority of banks provide exports to comma-separated values, Quicken and Microsoft Money formats, which I can then readily interrogate for any information. The issue is that I usually would have to navigate an online bank account interface that hasn't been updated since HTML tables were considered gauche, and I'd have to handle the authentication step of using a multi-factor authentication token, something that can't readily be abstracted away from the concrete implementation of each bank's token generator.
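To make that concrete, the sort of processing I have in mind is nothing more sophisticated than the sketch below - the column names and sign convention are assumptions, since every bank's export differs, but the principle holds:

import csv
from decimal import Decimal

def monthly_spend(csv_path):
    # Sum the outgoing transactions in a hypothetical CSV export with
    # Date, Description and Amount columns - real exports differ from
    # bank to bank, so these column names are assumptions.
    total = Decimal("0")
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            amount = Decimal(row["Amount"])
            if amount < 0:  # outgoing transactions assumed to be negative
                total += -amount
    return total

print(monthly_spend("statement.csv"))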

In terms of what is 'real' in this space at present, the API offerings are generally limited to a branch locator, an ATM locator and a product search API (as implemented by RBS and HSBC's banking brands). I appreciate the opening up of this data, but I can already obtain this location data from the Google Maps API, and I don't really want to (as an end-user) automate my product selection, given how creative with the truth banks can be about what is truly on offer. There seems to be a gulf between what customers really need and the minimum the banks could agree on between themselves. Of course, this is just the cost of trying to get the elephants of the financial sector to move away from the oases they've always known.

So why not do it manually?

I don't want to.

I guess that is the crux of it: I could manually go into the portals of each of my financial service providers and fetch a CSV file, put it somewhere and process it in any way I choose. But I don't want to. I don't want to be beholden to what financial service providers feel I should be able to do with my financial data. Of course, that's always the cost of doing business with anyone, but that would never stop me from being sore about it.

The spectre of multi-factor authentication and corporate inertia

Large corporates aren't the smartest when it comes to security in their customer-facing applications, and I think it would be naive to assert that large financial institutions would be immune to either 1) outright stupidity, as in the linked examples, or 2) groupthink that serves to permeate the entirety of a profession within an organisation.

In the context of multi-factor authentication used by banks, (2) is far more likely to be an issue in providing a good, automatable and secure API service to customers. The typical enterprise "these are the processes we have, they are immutable" inertia and subsequent ennui would be likely to set in: our current service is 'secure', so why would we do anything else? I've seen this time and again throughout my career and it seems to be something that no large corporate is immune to.

The hope we have to have here is that someone explains to them how the likes of Amazon's IAM, OAuth or any number of other token-based authentication methods work. I've never had any more faith in a mobile app-based multi-factor authentication token generator than in even the simplest of JSON Web Token generators, so hopefully others can come around to a similar realisation.
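For illustration of how little magic there is in this, here is a minimal sketch of an HMAC-signed, JWT-style token using nothing but the Python standard library - the claims and secret are placeholders, and a real implementation would use a maintained library:

import base64
import hashlib
import hmac
import json
import time

def b64url(data):
    # JWTs use unpadded URL-safe base64
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def make_token(payload, secret):
    # Minimal HS256 JWT: header.payload.signature
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = b64url(json.dumps(header).encode()) + "." + b64url(json.dumps(payload).encode())
    signature = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + b64url(signature)

print(make_token({"sub": "customer-123", "exp": int(time.time()) + 300}, b"shared-secret"))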

Is there hope for the future?

As far as I can see, my hopes are all pretty much in one basket, and it's not one I'm comfortable with: I'm not one to pin my hopes for change on a so-called 'disruptive' startup; and I'm certainly not one to hope for 'market forces' to pressure the larger players to compete with relatively niche service offerings. That said, Monzo recently being given a banking license, combined with the commitment to their APIs and integration platform that they've demonstrated throughout Beta, does give me some hope. If nothing else, it is a differentiator which may shape the choices I make over who I bank with.

On the investments front, not even Nutmeg appear to want to do anything in terms of exposing APIs to customers, so I may just have to make do with my own wranglings in that space.

Apr 12, 2017

Using physical devices in VirtualBox

If this post is useful to you, I'd greatly appreciate you giving me a tip over at PayPal or giving DigitalOcean's hosting services a try - you'll get 10USD's worth of credit for nothing

Sometimes it may be useful to, for example, access a physical Linux installation on a device from within another physically-installed operating system on the same device. Fortunately, this is possible with the VBoxManage command for VirtualBox. An example of such a command is given below:

VBoxManage internalcommands createrawvmdk -filename /path/to/file.vmdk -rawdisk /dev/sda

On Windows, the argument for the -rawdisk switch should take the form of \\.\PhysicalDrive0, where the disk identifier can be found using diskpart's LIST DISK command. The VBoxManage command needs to be run as Administrator, with any virtual machines launched using the resulting VMDK also requiring VirtualBox to be run as Administrator in order to be able to access the disk.
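For the avoidance of doubt, the Windows equivalent looks something like the below, with PhysicalDrive0 standing in for whichever disk identifier diskpart reports:

VBoxManage internalcommands createrawvmdk -filename C:\path\to\file.vmdk -rawdisk \\.\PhysicalDrive0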

Mar 12, 2017

What is the "Right" Tool for the Job?

If this post is useful to you, I'd greatly appreciate you giving me a tip over at PayPal or giving DigitalOcean's hosting services a try - you'll get 10USD's worth of credit for nothing

I doubt I'd find anyone willing to say that management and governance processes should get in the way of teams delivering anything - not even I would deign to do so in my most contrarian moments. Decisions should absolutely be made in the right place - with the teams who have done the work on whatever it is they're delivering - as doing anything else will lead to nothing other than something that doesn't meet the expressed needs of users. I've seen this point expressed articulately in a number of places, but most recently by Daffyd Vaughn in his piece 'Create the space to let teams deliver'. Daffyd makes some excellent points about making the governance fit the crime and making sure that teams are multi-disciplinary so that things can actually get done with at least less-imperfect knowledge from the people working on them. His discussion of tools, however, is more contentious from my point of view.

The background

I'm that rare beast from the point of view of modern 'digital' staff: I have a responsibility for a lot of technology that we deliver internally, whether that be continuous integration tooling, configuration management tooling or collaboration tooling for developers. The type of users I have raises a couple of interesting issues. My users are:

  • Technical
    • Technical users will always think they know best what tools will meet their needs - I'm projecting here, more than anything, because I know I'd be the same in their position.
  • Service-focussed
    • My users sit within delivery groups focussed on the provision of a certain type of service (eg, middleware, web-facing frontends): it would be unreasonable to expect them to have the time or inclination to care about enterprise-wide concerns.

This combination of technical knowledge and what could uncharitably be termed 'myopia' leads to a more complex environment for tools to be deployed into than may be hinted at in Daffyd's piece.

Laissez-faire tool selection in the enterprise context

Where teams are small and self-contained, a laissez-faire 'get what works' approach will probably be fine: concerns over integration between teams and worries about data siloisation probably aren't atop the list of concerns of anyone in the delivery organisation. In the context of an organisation of 60,000+, managing the data of tens of millions of customers and hundreds of projects across multiple service-focussed delivery groups with myriad interdependencies, those things start to matter a lot more. If Delivery Group A, for example, has chosen Jira to manage their work, and Delivery Group B makes the choice to use Tuleap, Project A+B turning up and requiring them to work together is going to involve either 1) one team giving up its preferred tool for the project (a bad thing if 'getting out of the way' in the tools space is the right approach, since someone is now 'getting in the way' by suggesting they use another tool) or 2) the duplication of data and all sorts of issues knowing what the truth is between the two tools.

There has to be a central function responsible for making corporate technology choices where it is expected that similar functionality will be required across the organisation across an indefinite time horizon to prevent the sort of siloisation that one would otherwise expect.

The need for (light-touch) control

While this may not be the case everywhere, when designing services for HMRC I have to be cognisant of the Public Sector Network (PSN) Code of Connection. Particularly pertinent in the context of giving people carte blanche to select their tools from the widest marketplace is section '1d. Protective monitoring and intrusion detection':

"If you are consuming Software as a Service (SaaS), you should consider how you will be able to monitor for any potential abuse of business process or privilege."

Given this constraint, and the potential risk that a service consumed on a device connected to the PSN could pose to the wider network, the idea of providing unfettered access to any given tool simply cannot fly - particularly in the case of tools such as Slack, where very little information is provided by the supplier - no matter how much you pay them - as to how they would handle a data breach or any other significant incident.

Some enterprise-level management of risk around tools needs to be undertaken to ensure that they are appropriate for use in the contexts they are intended for. No matter how multi-disciplinary a team is, it will require some assistance from people external to that team to give the present enterprise-wide view on a given technology, and those at the coalface will in turn be needed to inform that strategic view.

There are a number of points that can be made here around people and process considerations (such as telling people not to post sensitive information on SaaS), but technical controls are king here: Daffyd's point about making sure firewalls don't prevent access to collaboration tools ignores the complex policy and legal landscapes that exist, such as the obligations placed upon HMRC employees by the Commissioners for Revenue and Customs Act 2005 around not publishing taxpayer information - an error on the tool supplier's side, coupled with a lapse in judgement by a member of staff, could lead to a jail term for them, in spite of a lack of intent to violate their obligation.

I don't think it's a case of the legislation being at fault here: this chain of events would definitely be a feature rather than a bug when it comes to protecting citizen data. It is, though, something to be mitigated through sensible tool choice and management rather than the blunt instrument of law.

If a service is key to you, why would you trust someone else to run it?

No matter how widely used, no matter how resilient the design, no matter how prescriptive your Service Level Agreements with the provider are, the use of a SaaS tool will bring with it risks around availability and the perennial question of "what do I do if this goes down?" Of course, this is the case with any digital service, but at least if the service is managed in-house, I have a building and a desk number to go and visit whoever is responsible for the downtime. A large multinational corporation, on the other hand, does not care about you, no matter how big you are in UK Government, and it bemuses me how often smart people delude themselves into thinking otherwise.

Tools for your own delivery people should be delivered by people in-house, not only for the above reasons, but also for the inculcation of a spirit of camaraderie and mutual understanding between tool providers and core delivery functions.

A corporate managed service for delivery needs

I am absolutely not saying that delivery teams should not be given the tools they need, nor am I saying that they shouldn't play a role in the definition of the tooling that they use: both of these things are key for anything to be delivered. This isn't necessarily as easy as 'not getting in the way', however: 'not getting in the way' could have some terrible unintended side-effects.

A central technology function should absolutely 'get in the way' and make sure that the organisation provides its delivery functions with what they need, with that offering being provided as a platform intended to be constantly improved upon based on input from its users. Whim and fancy cannot rule at enterprise scale: user needs in the tooling space need to be taken together and a service provided that allows people to do their jobs, while maintaining the integrity of the organisation's data assets and security policy.

Mar 11, 2017

Building a Docker Container for Taiga

If this post is useful to you, I'd greatly appreciate you giving me a tip over at PayPal or giving DigitalOcean's hosting services a try - you'll get 10USD's worth of credit for nothing

It's no secret to people who know me that I am not the most organised person in the world when it comes to my personal life: far too often, things can wait until... well, until I forget about them. As part of a general bid to be more proactive about the things I want to get done in my free time, I had a look at the market for open-source project management software (the people who use Jira extensively at work always seem to be the most organised, but I'm not paying for this experiment to look into my own sloth) and came out wanting to give Taiga a try, it being a Python application that I'd be able to extend with a minimum of effort if there was some piece of obscura I wished to contribute to it. Of course, my compulsion towards self-hosting all of the web-based tools I use meant that the second half of the question would be to find a means by which I could easily deploy, upgrade and manage it.

Enter Docker. I'd initially found some Docker images on Docker Hub that worked and, in a jovial fit of inattention, proceeded to use them without quite realising how old they were. Eventually, I noticed that they had last been built nineteen months ago, for a project that has a fairly rapid release cadence. Fortunately, the creator of those images had published their Dockerfiles and configuration on GitHub; unfortunately, however, that configuration was itself out of date given recent changes in the supporting libraries for Taiga. The option of looking for other people's Docker containers, of course, did not occur to me, so I endeavoured to update and expand upon the work that had been done previously.

Taiga's architecture

Taiga consists of a frontend application written in Angular.js (I'm not a frontend person - I couldn't tell you if it was Angular 1 or Angular 2) and a backend application based on the Django framework. The database is a PostgreSQL database, nothing really fancy about it.

A half-done transformation

Looking at the code used to generate the Docker images, I noticed that there was a discrepancy between several of the paths used in building the interface between the frontend and backend applications: in the backend application, everything seemed to point towards /usr/src/app/taiga-back/, whereas in the frontend application, references were made to /taiga. This dated from the backend application being built around the python base image, before being changed to python-onbuild. The -onbuild variety of the image gives some convenience methods around running pip install -r requirements.txt without manual intervention, which I can see as a worthwhile bit of effort in terms of making future updates to the image easier. Unfortunately, it does change the path of your application: something that hadn't been fixed up to now. Fortunately, a trivial change of the frontend paths to /usr/src/app/taiga-back solved the issue.

Le temps detruit tout

Some time between the last time the previous author pushed his git repository to GitHub and now, the version of Django used by Taiga changed, introducing some breaking module name changes. The Postgres database backend module changed from transaction_hooks.backends.postgresql to django.db.backends.postgresql, with the new value having to be declared in the settings file that was to be injected into the backend container.
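For anyone hitting the same change, the database section of the injected settings file ends up looking something like the sketch below - the connection details are placeholders for whatever the container is linked to:

# Excerpt from the Django settings injected into the backend container -
# the ENGINE value is the change that matters; the rest are placeholders.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",  # was transaction_hooks.backends.postgresql
        "NAME": "taiga",
        "USER": "taiga",
        "PASSWORD": "changeme",
        "HOST": "postgres",
        "PORT": "5432",
    }
}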

Doing something sensible about data

Taiga allows users to upload files to support the user stories and features catalogued within the tool, putting these files in a subdirectory of the backend application's working directory. Now, if we're to take our containers to be immutable and replaceable, this just won't do: the deletion of the container would result in the deletion of all data therein. Given that the Postgres container was set up to store its data on the filesystem of the host, outside of the container, it's a little odd that the backend application didn't have the same consideration taken into account. Declaring the media and static directories within the application to be VOLUMEs in the Dockerfile resolved this issue.

Don't make assumptions about how this will be deployed

In the original repository, the ports and whether HTTPS was being used for communication between the front and backend had been hard-coded into the JSON configuration for the frontend application: it was HTTP (rather than HTTPS) on port 8000. Now, if one were to deploy this onto a device running SELinux with the default policy, setting up a reverse proxy to terminate SSL would have been impossible because of the expectation that port 8000 would only be used by soundd - with anything else trying to bind to that port being told that it can't. To remedy this, I made the port and protocol being used configurable from environment variables at the time of container instantiation.

Upgrades

The repository put together previously contained, as well as the Dockerfiles for generation of the images, scripts to deploy the images together and have the application work. It did not, however, give any consideration to how an upgrade could work. With that in mind, I put together a script that would pull the latest versions of the images I'd put together, tear down the existing containers, stand up new ones and run any necessary database migrations. Nothing more complex than the below:

#!/bin/bash

if [[ -z "$API_NAME" ]]; then
  API_NAME="localhost";
fi

if [[ -z "$API_PORT" ]]; then
  API_PORT="8000";
fi

if [[ -z "$API_PROTOCOL" ]]; then
  API_PROTOCOL="http";
fi

docker pull lxndryng/taiga-back
docker pull lxndryng/taiga-front
docker stop taiga-back taiga-front
docker rm taiga-back taiga-front
docker run -d --name taiga-back  -p 127.0.0.1:8000:8000 -e API_NAME=$API_NAME  -v /data/taiga-media:/usr/src/app/taiga-back/media --link postgres:postgres lxndryng/taiga-back
docker run -d --name taiga-front -p 127.0.0.1:8080:80 -e API_NAME=$API_NAME -e API_PORT=$API_PORT -e API_PROTOCOL=$API_PROTOCOL --link taiga-back:taiga-back --volumes-from taiga-back lxndryng/taiga-front
docker run -it --rm -e API_NAME=$API_NAME --link postgres:postgres lxndryng/taiga-back /bin/bash -c "cd /usr/src/app/taiga-back; python manage.py migrate --noinput; python manage.py compilemessages; python manage.py collectstatic --noinput"

GitHub repository

The Docker configuration for my spin on the Taiga Docker images can be found here.

Mar 03, 2017

Building a Naive Bayesian Programming Language Classifier

If this post is useful to you, I'd greatly appreciate you giving me a tip over at PayPal or giving DigitalOcean's hosting services a try - you'll get 10USD's worth of credit for nothing

GitHub's Linguist is a very capable Ruby project for classifying the programming language(s) of a given file or repository, but it struggles a little when there isn't a file extension present to give an initial hint as to what programming language may be being used: given this lack of an initial hint, none of the clever heuristics that are present within Linguist can be applied as part of analysis of the source code. As part of a project I'm working on at the moment, I have around 32,000 code snippets with no file extension information that I'd like to classify, with the further knowledge that some of these snippets may not be in a programming language at all, but rather a natural language, or may just be encrypted or encoded pieces of text. Applying the Pythonic if it quacks like a duck, it's a duck approach, a naive Bayesian method whereby we just see if a snippet looks like something we've seen in another language seems like it might work well enough.

So why a Bayesian method?

In the main: I'm lazy and not a particularly mathematically inclined person. I also wrote half of a dissertation on Bayesian methods as applied to scientific method, so I've got enough previous in this space to at least pretend I've got some background in the field. On top of that, Bayesian classifiers give us an easy way to assume that the incidence of any evidence is independent of the incidence of any other. We end up with a fairly simple equation for finding the probability of a given programming language given the elements of language we have in a code snippet.

P(Language|Snippet n-grams) = P(Language) * P(Snippet n-gram(1)|Language) * P(Snippet n-gram(2)|Language) ... P(Snippet n-gram(n)|Language)
                              -----------------------------------------------------------------------------------------------------------
                                              P(Snippet n-gram(1)) * P(Snippet n-gram(2)) ... P(Snippet n-gram(n))

We end up with very small numbers here - so much so that we get floating-point underflow. To avoid this, we can use the natural logarithms of the probabilities on the right-hand side, and add rather than multiply them.
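As a sketch of what the scoring step looks like once the logarithms are taken - the probability lookups here are assumed to have been computed from the training data described later, and the tiny floor for n-grams never seen in a given language is my own crude assumption:

import math

def score(language, snippet_grams, p_gram_given_lang, p_gram, p_lang):
    # Sum of log-probabilities rather than a product of probabilities,
    # to avoid floating point underflow.
    total = math.log(p_lang[language])
    for gram in snippet_grams:
        total += math.log(p_gram_given_lang[language].get(gram, 1e-9))  # floor for unseen n-grams
        total -= math.log(p_gram.get(gram, 1e-9))
    return total

# The language with the highest score is the prediction:
# max(languages, key=lambda lang: score(lang, grams, p_gram_given_lang, p_gram, p_lang))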

How do we identify languages?

Linguist has six so-called "strategies" for identifying programming languages: the presence of emacs/vim modelines in a file, the filename, the file extension, the presence of a shebang (eg, #!/bin/bash) in a given file, some predefined heuristics and a Bayesian classifier of its own, though with no persistence of the training data across runs of the tool. In this approach, we'll only be implementing the classifier, but using heuristic-like methods to supplement the ability of the model to accurately identify certain languages.

The first element of the classification model will be based upon n-grams, where n will be between 1 and 4. I want to be able to classify on the basis of single keywords (eg, puts in Ruby), as well as strings of words (eg, the public static void main method signature in Java).

At the core of this, we have a very basic tokeniser that should give us enough tokens to create the n-grams from which we can infer the language a given code snippet is written in.
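A minimal sketch of the sort of tokenisation and n-gram generation I mean - the real tokeniser is a little more involved, but this is the shape of it:

import re

def tokenise(source):
    # Crude tokeniser: words, numbers and the individual punctuation
    # characters that tend to be meaningful in source code.
    return re.findall(r"[A-Za-z_][A-Za-z0-9_]*|\d+|[^\sA-Za-z0-9_]", source)

def ngrams(tokens, max_n=4):
    # n-grams for n between 1 and 4, joined with spaces so that they can
    # be stored and looked up as plain strings.
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])

print(list(ngrams(tokenise("public static void main(String[] args)")))[:5])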

A simple improvement on this would be to remove anything that would add plaintext to the mix: comments, docstrings and the like. As I said above, I'm not really concerned with 100% accuracy: something that quacks like a duck might be enough for us to say that it's a duck here.

Languages I have to deal with

From a cursory look at the 32,000 snippets, I know that I definitely have to be able to identify and distinguish between Python, Ruby, C#, C, C++, x86 assembly (I think - this could be a rabbit hole and a half to go down) and Java. We can reasonably expect that differentiating Java from C#, and C from C++, will be painful and prone to error until we refine the model, given the similarities that these languages have to one another.

To start, I will just be attempting to demonstrate that my broad approach works with at least Python, Ruby, Assembly, C# and Java before looking to incorporate more languages.

Persistence of the probability data

With a Bayesian approach, we need to be able to refer to a trained model of the probabilities of given features for a given programming language in order to predict which programming language a given snippet is written in. In order to do this, we need to store this probability information somewhere. For the sake of simplicity, I'll be doing this in MariaDB, with the basic schema below:

DROP DATABASE IF EXISTS bayesian_training;
CREATE DATABASE bayesian_training;
USE bayesian_training;
CREATE TABLE grams(
    id INT AUTO_INCREMENT PRIMARY KEY,
    gram VARCHAR(100) UNIQUE NOT NULL
);
CREATE TABLE languages(
    id INT AUTO_INCREMENT PRIMARY KEY,
    language VARCHAR(20) UNIQUE NOT NULL
);
CREATE TABLE occurences(
    gram_id INT NOT NULL,
    language_id INT NOT NULL,
    number INT NOT NULL,
    PRIMARY KEY(gram_id, language_id),
    FOREIGN KEY(gram_id) REFERENCES grams(id),
    FOREIGN KEY(language_id) REFERENCES languages(id)
);

Training of the model

To train the model, I used the following codebases:

  • Python
    • Django
    • Twisted
  • Ruby
    • Sinatra
    • Discourse
  • Java
    • Jenkins
    • Lucene and Solr
  • Assembly
    • BareMetalOS
    • Floppy Bird
  • C#
    • GraphEngine
    • Json.Net
    • wtrace

These are, in the realms of real-world code usage, pretty small samples to be going on, but should hopefully give us enough to get a system together that works.

How effective was our initial model?

In order to test how well we did, I tested the following files against the model: linguist.rb (Ruby), ZipFile.cs (C#), flask/app.py (Python) and tetros.asm (assembly).

The results:

linguist.rb: [(7953, 'asm', -6416.136889371387), (8002, 'c#', -6630.869975742312), (3931, 'java', -6849.643512121844), (1, 'python', -6302.470348917564), (1763, 'ruby', -5991.090879392727)]
ZipFile.cs: [(7953, 'asm', -164549.47730156567), (8002, 'c#', -144878.96700648475), (3931, 'java', -152243.66607448383), (1, 'python', -158673.75993403912), (1763, 'ruby', -159188.1657594956)]
flask/app.py: [(7953, 'asm', -189603.1365282128), (8002, 'c#', -195084.66248479518), (3931, 'java', -196401.08214636377), (1, 'python', -171435.95779745755), (1763, 'ruby', -183980.2802635695)]
tetros.asm: [(7953, 'asm', -17961.240272497354), (8002, 'c#', -28535.4183269716), (3931, 'java', -28894.289472605506), (1, 'python', -28315.088969821816), (1763, 'ruby', -27569.732161692715)]

For all of the tested files, the language with the maximum log value is the language we knew the file was written in: at least we're getting the right answers, for the most part.

Technical niggles

The way that they're constructed, the database queries used in the training stage can become incredibly large - too large for the default max_allowed_packet value of 1MB in my.cnf. Setting this to 64MB was sufficient to have all of my queries resolved.
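For reference, the change amounts to nothing more than the below in my.cnf (or a file under its include directory), followed by a restart of MariaDB:

[mysqld]
max_allowed_packet = 64M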

Code

The code used for this classifier can be found at GitLab. This may also be released to PyPI at some point.

Feb 13, 2017

ASUS Zenbook Pro UX501VW configuration for Linux

If this post is useful to you, I'd greatly appreciate you giving me a tip over at PayPal or giving DigitalOcean's hosting services a try - you'll get 10USD's worth of credit for nothing

Never trust laptop OEMs if you want to run Linux on a laptop. Well, maybe the more sensible option is to buy laptops from vendors who explicitly support Linux on their hardware (the Dell XPS and Precision lines are supposed to be good for this, as well as the incomparable System76). All of this said, I own an ASUS Zenbook UX501VW and it is a good machine, just a little temperamental when it comes to running Linux, especially compared to my Lenovo Thinkpad X1 Carbon. Hopefully the following misery I went through will be of use to someone else with this laptop.

Graphics issues

Most people, upon booting any graphical live CD/USB, will be greeted with their laptop's fans spinning up, followed by a hard lock-up. Probably surprising no one, this is an issue with the Nvidia switchable graphics: some ACPI nonsense occurs if the laptop is started with the Nvidia card powered down. There are two options for getting around this:

1. Disabling the Nvidia card's modesetting altogether

To do this, you need to set the kernel option of nouveau.modeset=0. The card will then not have modesetting enabled and therefore will not cause an issue once X loads.

2. Making it seem like you're running an unsupported version of Windows

This is witchcraft and I make no claims to understand how it works, but setting the kernel option acpi_osi=! acpi_osi="Windows 2009" stops the issue that causes the X lockups that occur usually.

Backlight keyboard keys

To get the keyboard buttons for brightness adjustment working (and brightness adjustment at all in some cases), the following kernel options need to be specified:

acpi_osi= acpi_backlight=native

These options aren't compatible with the second option above, so pick being able to do CUDA development on a laptop (come on, now) or being able to change the brightness. It was an easy enough choice for me.

Touchpad issues

This is a matter of luck: some of the models designated UX501VW have a Synaptics touchpad and will work brilliantly out of the box. If you're a little more unfortunate, you have a FocalTech touchpad - a touchpad that only this and a couple of other ASUS devices have. A quick way to tell is to test two-finger scrolling: if it works, you have a Synaptics touchpad - enjoy your scrolling. If it doesn't, you probably have the FocalTech.

There is, however, a DKMS driver available for this touchpad which is targeting inclusion in the mainline kernel. It might take a while to get there, but it will be supported by default soon enough. In the interim, cloning the git repository linked above, making sure you have the prerequisites installed (apt-get install build-essential dkms for Debian/Ubuntu-based systems) and running ./dkms-add.sh from within the directory should be enough to get you going.

Every time your kernel updates, you'll need to re-run ./dkms-add.sh.

Feb 09, 2017

Setting up open-source multi-factor authentication for Amazon WorkSpaces

If this post is useful to you, I'd greatly appreciate you giving me a tip over at PayPal or giving DigitalOcean's hosting services a try - you'll get 10USD's worth of credit for nothing

AWS's Identity and Access Management (IAM) is a wonderful service that allows its users to leverage an incredibly powerful suite of fine-grained access controls to really implement the principle of least privilege to secure services hosted with AWS. It also happens to have a fairly simple multi-factor authentication (MFA) approach that uses the popular OATH TOTP (Time-based One Time Password) standard implemented across a number of virtual and hardware token generators.

Amazon's Desktop-as-a-Service WorkSpaces product, however, departs from this approach of de jure and de facto standards through TOTP and IAM for multi-factor authentication, instead leaving users of the service to provide a second factor of their choosing, as long as it can be handled through RADIUS: a problem which has recently caused me some issues at work. In the hope that no one should have to deal with this in the same way I did, it's probably worth going over where the issues start and how they can be addressed. It's probably just best to ignore the fact that RADIUS is a technology best left in the nineties.

WorkSpaces authentication architecture, in broad strokes

To use multi-factor authentication at all, it is necessary that the directory service supporting the WorkSpaces for which one wishes to enable multi-factor authentication be a Microsoft Active Directory (AD) instance - if you were using Amazon's Simple Directory services, you're going to have to start again. In order to leverage an external AD instance, it is necessary to deploy an Amazon service called AD Connector into two subnets (for High Availability purposes - let's ignore the fact that we're only running a single domain controller, don't worry about it) within the VPC that is being used to host the WorkSpaces instances. It is the AD Connector service that provides the magic that will allow us to put an MFA solution into practice: configuration of a RADIUS server to be used as the second factor for authentication can be undertaken here.

In the context of the MFA flow for WorkSpaces, the public-facing brokering service that Amazon provides to enable connectivity over PCoIP to the WorkSpaces instance orchestrates the authentication flow such that initial authentication occurs with username and password against AD, with the username and provided MFA token being submitted to the RADIUS server in the event that the initial authentication is successful.

So what do we need?

If we assume that we don't have any authentication infrastructure outside of what we're putting in place for WorkSpaces, aside from the AD domain to be used as the first factor, we need:

  • A RADIUS server
  • A means to generate one-time passwords
  • Management infrastructure for the one-time passwords

Things of note

If your organisation's MFA approach is premised upon TOTP, you're going to be doing something non-compliant with that approach for WorkSpaces: Amazon (at time of writing) explicitly - in a footnote in an FAQ - warns its users that TOTP is not supported. The only standardised OTP process that remains, then, is HOTP - an algorithm that relies upon a counter, which will probably cause you operational issues with users who generate tokens without using them.
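For the curious, HOTP itself is simple enough that a sketch of the RFC 4226 algorithm fits in a few lines - and makes clear why an unused token leaves the client's counter ahead of the server's:

import hashlib
import hmac
import struct

def hotp(secret, counter, digits=6):
    # RFC 4226: HMAC-SHA1 over the big-endian counter, dynamic truncation,
    # then reduction to the requested number of decimal digits.
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F
    code = int.from_bytes(digest[offset:offset + 4], "big") & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

print(hotp(b"12345678901234567890", 0))  # RFC 4226 test vector: 755224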

It is also necessary (for fairly common-sense reasons) that the user store called out to from RADIUS has exactly the same usernames as those present in your Active Directory; case sensitivity can be an issue here.

An open-source product selection for RADIUS and OTP generation

Much as my first inclination will always be to prefer open-source solutions, the first port of call for the RADIUS service to support WorkSpaces was Microsoft's Network Policy and Access Services: a quick answer was preferable to a philosophically satisfying one. It turns out, however, that there is no OTP verification service available for integration with NPS that doesn't cost money. Given the speed needed to get an MFA solution bottomed out, working through a corporate commercial process didn't seem that appealing.

The aptly named FreeRADIUS seemed a sensible choice for the RADIUS server, given its flexibility and support for a number of means of authorisation - in case no OTP mechanism would readily become available. After a little bit of digging, I found a PHP script/class called multiotp that seemed to offer the functionality that I required (HOTP), as well as being able to integrate with user lists from an AD server.

A VPC design for MFA-supporting WorkSpaces

Amazon VPC for MFA

MultiOTP configuration

While multiotp does offer a MySQL backend (which could be made readily resilient using AWS's Relational Database Service), this has been omitted here for the sake of simplicity: the flat-file backend should be sufficient as long as we are sensible about syncing it between the two RADIUS servers. In order to set up a user, we can issue the command:

multiotp -create -no-prefix-pin [username] hotp [hexadecimal-encoded-seed] 1111

OpenSSL's rand command can be used with its -hex switch to generate a random hex-encoded string to provide a suitable seed.
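Something along the lines of the below will do, with 20 bytes giving the 160-bit seed length that HOTP implementations commonly expect:

openssl rand -hex 20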

Following this, we can generate a QR code for the user to use to register WorkSpaces as an application in their software token generator:

multiotp -qrcode [username] [file_path_for_png.png]

multiotp caveats

multiotp maintains its own user database, incorporating some OTP-specific AD user synchronisation, but makes some unfortunate assumptions about the types of OTP you'll want to use: it will import all users with the assumption that they will be using TOTP, rather than the HOTP required here, with the CLI not currently presenting configuration options to change the algorithm associated with a user once set, or to change the default algorithm used for AD-synced users.

I expect I'll submit some pull requests against their GitHub project in the fullness of time, but even with the caveats made explicit above, it does seem to be the most competent means of returning a RADIUS-compatible response from an OTP generation algorithm (a number of them, in fact). Until Amazon does something more sensible around MFA for WorkSpaces, something that works will suffice when the alternative is nothing.

The seed provided to multiotp needs to be a hexadecimal string, while software tokens (such as Google Authenticator) will often require a base32-encoded version of the string used to generate the first hex string. The easiest way around this is to use the token generation algorithm in the multiotp class and just generate a QR code using the CLI to be distributed to the user.
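If you do find yourself needing to hand a base32 seed to a token generator manually, the conversion is trivial - a sketch with a placeholder seed:

import base64

hex_seed = "3132333435363738393031323334353637383930"  # placeholder hexadecimal seed
base32_seed = base64.b32encode(bytes.fromhex(hex_seed)).decode()
print(base32_seed)  # the form a software token such as Google Authenticator expects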

FreeRADIUS configuration

With a user account to actually test against, we need to configure FreeRADIUS to hand off authentication requests to multiotp. In order to do this, we create a new module for authentication in /etc/raddb/modules/ called multiotp with the content below:

exec multiotp {  
    wait = yes  
    input_pairs = request  
    output_pairs = reply  
    program = "/path/to/multiotp '%{User-Name}' '%{User-Password}' -request-nt-key -src=%{Packet-Src-IP-Address} -chap-challenge=%{CHAP-Challenge} -chap-password=%{CHAP-Password} -ms-chap-challenge=%{MS-CHAP-Challenge} -ms-chap-response=%{MS-CHAP-Response} -ms-chap2-response=%{MS-CHAP2-Response}"  
    shell_escape = yes  
}

In the file /etc/raddb/sites-enabled/default, the directive multiotp should be added before the first instance of pap in the file, and the first instances of chap and mschap commented out with a hash (#). Additionally, the following should be added prior to the first Auth-Type PAP in the file:

Auth-Type multiotp {  
    multiotp  
}

In /etc/raddb/sites-enabled/inner-tunnel, the first instances of chap and mschap should again be commented out, and the following added prior to the first Auth-Type PAP directive:

Auth-Type multiotp {  
    multiotp  
}

The authorisation policy established above then needs to be enabled in /etc/raddb/policy.conf by adding the following just before the last } in the file:

multiotp.authorize {  
    if (!control:Auth-Type) {  
        update control {  
            Auth-Type := multiotp  
        }  
    }  
}

In /etc/raddb/clients.conf, appropriate client information should be populated, with [RADIUS shared secret] being replaced with a secure shared secret to be used to establish the RADIUS connection between AD Connector and FreeRADIUS:

client 0.0.0.0 {  
    netmask = 0  
    secret = [RADIUS shared secret]  
}

Start the RADIUS server in debug mode with radiusd -X and leave it running: upon setting up the RADIUS server in AD Connector, we'll be able to make sure that things are working as they should here.

AD Connector configuration

In the AWS Console, MFA can be activated through the Update Details menu for directories defined within the WorkSpaces service. Enter the IP address of your RADIUS server and the shared secret defined earlier within the Multi-factor Authentication section.

WorkSpaces MFA screen

Upon clicking update, you should see an authentication request from a user of awsfakeuser with the password badpassword. After a few minutes, the RADIUS service will be registered for the WorkSpaces directory. From here, try generating an MFA code for real and signing into a WorkSpace using the WorkSpaces client.

Jun 08, 2014

Flask, Safari and HTTP 206 Partial Media Requests

If this post is useful to you, I'd greatly appreciate you giving me a tip over at PayPal or giving DigitalOcean's hosting services a try - you'll get 10USD's worth of credit for nothing

While working on my Python 3, Flask-based application Binbogami so that a friend would be able to put their rebirthed podcast online, a test scenario that I hadn't thought to check came to light: streaming MP3 in Safari on an iOS device. It turns out that attempting to do this resulted in an error in Safari along the lines of the below:

Safari iOS error

A little more investigation showed that this error was repeated in Safari on OSX. Given the unfortunate trinity of erroneous situations that Binbogami seemed to fall foul of, it seemed that the problem lay with how Safari, or QuickTime as the interface for media streaming under Safari on these platforms, was attempting to fetch the file.

The problem

A cursory DuckDuckGo search led me to find that where Firefox, Chrome, Internet Explorer and Opera all use a standard HTTP GET request for fetching media, even where this media could be considered to be being streamed, Safari's dependency on QuickTime for media playback means that upon attempting to fetch the file, an initial request for the first two bytes of the file is made using the Range request header to determine its length and other header-type information, with further Range requests being made subsequently.

By default, the method that I was making use of in Flask to serve static files does not issue the HTTP 206 response headers necessary to make this work, nor does it pay any heed to the Range of bytes requested in the request headers.

Resolution

While it seemed apparent that implementing the correct headers in the HTTP response and some sort of custom method to send only the requested bytes within a file would be the way around this, my head was not particularly in the space of implementation. Again, with some internet searching I came across an instructive blog post that appeared to have a sensible answer. With a little bit of customisation to suit my own particularities:

import mimetypes
import os
import re

from flask import Response, current_app, request, send_from_directory


def send_file_206(path, safe_name):
    range_header = request.headers.get('Range', None)
    if not range_header:
        return send_from_directory(current_app.config["UPLOAD_FOLDER"], safe_name)

    size = os.path.getsize(path)
    byte1, byte2 = 0, None

    m = re.search(r'(\d+)-(\d*)', range_header)
    g = m.groups()

    if g[0]: byte1 = int(g[0])
    if g[1]: byte2 = int(g[1])

    length = size - byte1
    if byte2 is not None:
        length = byte2 - byte1

    data = None
    with open(path, 'rb') as f:
        f.seek(byte1)
        data = f.read(length)

    rv = Response(data,
        206,
        mimetype=mimetypes.guess_type(path)[0],
        direct_passthrough=True)
    rv.headers.add('Content-Range', 'bytes {0}-{1}/{2}'.format(byte1, byte1 + length - 1, size))
    return rv

A secondary issue

While the above did lead Safari to believe that it could indeed play the files, it would always treat them as "live broadcasts", rather than MP3 files of a finite length. This is due to the way in which QuickTime establishes the length of a file through its initial requests for a few bytes at the head of a file: if it cannot get the number of bytes that it expects, it ceases trying to issue Range requests and instead issues a request with an Icy-Metadata header, implying that it believes the file to be an IceCast stream (WireShark is a wonderful tool).

The issue in the above code is found in the byte1 + length - 1 statement in the issued Content-Range header: where Safari requests two bytes in its first request (so the Range header will look like Range: 0-1), length evaluates to 1 - 0 = 1, so only the 0th byte is sent (and the Content-Range reads bytes 0-0) - not the 0th and 1st bytes as requested. The file still looks like a valid MP3 file, however, so Safari requests the whole file as a stream - hence the "Live Broadcast" designation.

A simple fix was to add +1 to the length declaration, to make it length = byte2 - byte1 + 1.
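To make the arithmetic concrete for Safari's initial Range: 0-1 request (the values below just trace the code above):

byte1, byte2 = 0, 1  # Safari's first request: Range: bytes=0-1

old_length = byte2 - byte1      # 1 -> only the 0th byte is sent
new_length = byte2 - byte1 + 1  # 2 -> the 0th and 1st bytes, as requested

print(byte1 + old_length - 1)   # 0 -> Content-Range: bytes 0-0/size
print(byte1 + new_length - 1)   # 1 -> Content-Range: bytes 0-1/size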

Conclusions

It's interesting to see how much the implementations of media downloading functionality in mainstream browsers can differ based upon the technology underlying them. In the case of Safari's approach, however, it seems somewhat contrary to the major use case: most people using the browser to access a media file will be seeking to download it, rather than "stream" it in the traditional sense.

Safari's approach also has the downside of generating a lot of HTTP requests, which as a systems administrator can cause havoc if you're yet to set up log rotation for your webserver and application server container (Nginx and uWSGI in this case). It hadn't been long enough since I'd last seen a high wa% in top.