Monday, October 5, 2015

New book: The Groovy 2 Tutorial

Today I published the completed "The Groovy 2 Tutorial" through LeanPub: https://leanpub.com/groovytutorial.

It's been a real labour of love as I researched the Groovy language, wrote the text and example code, edited and reviewed my work and even prepared a companion website (http://www.groovy-tutorial.org/). In all it's 443 pages and 64k words.

As Groovy is an open source effort I've made the full text available online under a CC-BY licence. You can also purchase a copy to receive the PDF/ePub version.

Thursday, October 1, 2015

The sum of our component parts

I’ve recently been reading a book that’s been on my to-do stack for a little while: The Machine That Changed the World: The Story of Lean Production. It’s an extremely interesting look at the Toyota approach to manufacturing and is often referenced by those describing lean software development and DevOps. One point that really stuck in my mind was that the act of assembling a vehicle represents 15–20% of the overall effort of constructing it. There’s a real parallel in how we build software, as the business solution component of the codebase is likely to represent a similar ratio to the underlying libraries we depend on to perform a range of tasks and save us time. The model used by Toyota to source and work with component manufacturers was quite interesting and there are handy parallels to software development.

It’s funny how things converge and Episode 63 of the DevOps Cafe Podcast featured Josh Corman discussing efforts such as Rugged Software and I am the Cavalry. With Mary and Tom Poppendieck featuring in Episode 63 there seemed to be a good confluence of thoughts going on (for me, anyway) and I started to think more about the task of reducing waste at the time of component selection.

I would normally undertake investigation of the first-level dependencies but these checks are usually around the activity within the component’s project, frequency of release, status of a CI system and so on. The recent book and podcast inputs started me thinking about tooling that not only helped me discover any issues with dependencies but also across the set of transitive dependencies. Furthermore, such tooling would let me establish an ongoing report that could track components in use against newly determined security issues and bugs as well as determine if there’s an opportunity to rationalise versions or even libraries (I’m looking at you logging frameworks). In a perfect scenario, an issue with a component in production could trigger an alert for developers to review and act upon.

The problem

The problem statement I got to was:

Can I use Gradle to report on dependency licenses and known issues?

I picked Gradle as it’s my go-to build tool and I wanted to focus on existing approaches. I added in the licence aspect because, whilst it’s not a software issue, there’s a chance that one of the dependencies could “infect” the desired software distribution approach and knowing this before release is a handy thing. I’d situate the problem within the Defects category of the seven wastes and removing waste is a key improvement (kaizen) activity - I want to avoid the defect early on but also detect if a component is discovered to be defective at a later time.

Within the problem statement are four key sub-questions for attention:

Q1. What is the mix of licences in the dependencies?

Q2. Does this licence mix impede the desired outcome?

Q3. Are any of the library dependencies known to have security issues and/or other defects?

Q4. Are there known issues (security, bugs etc) with the platform we're running on and the packages we need?

On the licensing side (Q1, Q2) we need two things: a list of the licences for all of our software dependencies, and knowledge as to which licences we are prepared to accept. From these we could prepare a ruleset that automatically alerts developers when a dependency with an unacceptable licence is included. This would be part of the build and continuous integration reporting and notification configuration.
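
As a minimal sketch of such a ruleset, assuming the licence list has already been extracted into a mapping of dependency to licence names: the accepted set, the coordinates and the function name below are all hypothetical, and the rule that one accepted licence is enough for a multi-licensed dependency is an assumption rather than legal advice.

```python
# Hypothetical allow-list: which licences are acceptable is a policy decision,
# not something this sketch can decide for you.
ACCEPTED_LICENCES = {"Apache License, Version 2.0", "MIT License"}


def find_unacceptable(dependency_licences, accepted=ACCEPTED_LICENCES):
    """Return {dependency: [licences]} for dependencies that break the ruleset.

    Assumption: a multi-licensed dependency passes if at least one of its
    licences is accepted. A dependency with no recorded licence is reported
    (with an empty list) so that someone reviews it by hand.
    """
    violations = {}
    for dependency, licences in dependency_licences.items():
        if not any(licence in accepted for licence in licences):
            violations[dependency] = sorted(licences)
    return violations


if __name__ == "__main__":
    # Made-up coordinates and licence names, for illustration only.
    sample = {
        "org.example:alpha:1.0": ["Apache License, Version 2.0"],
        "org.example:beta:2.1": ["Example Copyleft License"],
        "org.example:gamma:0.3": [],
    }
    for dependency, licences in sorted(find_unacceptable(sample).items()):
        print("REVIEW", dependency, licences or "no licence recorded")
```

A CI job could fail the build, or just send a notification, whenever the returned mapping is not empty.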

As Gradle can use Maven dependencies and I’d been involved in preparing packages for Maven Central I knew that the artifacts found there should have licence information within their POM file. Q1 should be reasonably achievable and I decided to put Q2 on the backburner.

On the library vulnerabilities side (Q3) we need:

Q3.1. A list of all dependencies and transitive dependencies

Q3.2. A database of known vulnerabilities

Q3.3. The ability to determine any intersections between the lists from Q3.1 and Q3.2

Q3.4. An established process to assess and respond to potential security issues

Gradle’s dependencies task will display a text overview of the dependencies for a build, indicating that the data for Q3.1 was available, but I was a bit dubious about the existence of data for Q3.2, which would in turn affect Q3.3. I’ll leave Q3.4 aside for now.
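
To make Q3.1 and Q3.3 a little more concrete, here is a rough sketch that pulls resolved coordinates out of the tree printed by the dependencies task and intersects them with a list of known issues. The sample report, the coordinates and the issue identifier are invented for illustration, the regular expression only handles the common `group:name:version` lines, and matching on exact coordinates is a simplification of what a real vulnerability checker has to do.

```python
import re

# Matches tree lines such as "+--- group:name:1.4 -> 1.10 (*)".
# Assumption: only the plain group:name:version form is handled.
TREE_LINE = re.compile(r"[+\\]--- ([^:\s]+):([^:\s]+):(\S+)(?: -> (\S+))?")

# Invented report text in the style of `./gradlew dependencies`.
SAMPLE_REPORT = r"""
compile - Compile classpath for source set 'main'.
\--- org.example:parsers:1.10
     +--- org.example:core:1.10
     \--- org.example:codec:0.6
          \--- org.example:core:1.4 -> 1.10 (*)
"""

# Hypothetical "database" of known issues keyed by exact coordinates.
KNOWN_ISSUES = {("org.example", "codec", "0.6"): ["EXAMPLE-0001"]}


def resolved_dependencies(report_text):
    """Collect (group, name, version) for every dependency in the tree (Q3.1)."""
    found = set()
    for line in report_text.splitlines():
        match = TREE_LINE.search(line)
        if match:
            group, name, requested, resolved = match.groups()
            # "1.4 -> 1.10" means 1.4 was requested but 1.10 was selected.
            found.add((group, name, resolved or requested))
    return found


def affected(dependencies, known_issues):
    """Intersect the dependency set with the known issues (Q3.3)."""
    return {dep: known_issues[dep] for dep in dependencies if dep in known_issues}


if __name__ == "__main__":
    deps = resolved_dependencies(SAMPLE_REPORT)
    for dep, issues in sorted(affected(deps, KNOWN_ISSUES).items()):
        print(":".join(dep), issues)
```

The set returned by `resolved_dependencies` already includes the transitive dependencies, which is the part that is tedious to check by hand.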

I put Q4 on the backburner as I decided that it reached into other aspects of deployment such as packaging models (e.g. RPM) and provisioning (e.g. Puppet). Issues around these would also be analysed early but using tools outside of Gradle.

Returning to Q3.2: the OWASP Top 10 for 2013 features A9 Using Components with Known Vulnerabilities, and tagged against it is the OWASP Dependency Check project. The dependency check tool uses the National Vulnerability Database to source a list of known/reported issues. Things were looking a bit more optimistic for Q3.3.

The next step was to check whether I could get Gradle plugins to help with Q1 and Q3, and I came up with the license plugin (com.github.hierynomus.license) for Q1 and the dependency check plugin (dependency.check) for Q3.

The prototype

It looked like I could try a small prototype. As part of the very basic demonstrator I declare one dependency, Apache Tika, in a Gradle build file. I picked Tika as I know it has a non-trivial set of dependencies.

Aside from Gradle’s Java plugin I also use the project-report plugin as it generates a nice dependency report.

The whole build file looks as follows:

plugins {
    id 'java'
    id 'project-report'
    id "com.github.hierynomus.license" version "0.11.0"
    id "dependency.check" version "0.0.6"
}

repositories {
    jcenter()
}

dependencies {
    compile 'org.apache.tika:tika-parsers:1.10'
}

downloadLicenses {
    includeProjectDependencies = true
    dependencyConfiguration = 'compile'
}

To generate the license report:

./gradlew downloadLicenses

To generate the project reports:

./gradlew htmlDependencyReport

To view the dependencies:

./gradlew dependencies

To create a dependency check report (this takes a while):

./gradlew --info dependencyCheck

I also set up a second Gradle build file (details.gradle) as a small attempt at extracting some details from Gradle:

./gradlew -b details.gradle listRepositoryUrls
./gradlew -b details.gradle listConfigurations
./gradlew -b details.gradle listAllDependencies

The demo code is available in my GitHub account.

Conclusion

The license plugin produced two reports: dependency-license and license-dependency in both HTML and XML. It all looked pretty good in terms of solving Q1 and the XML could feed into a small analysis script to raise any concerns (Q2).
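
A small analysis script over the XML might start out like the sketch below, which tallies the licence mix for Q1. The XML shape here (a `dependencies` root with `dependency` elements containing `license` elements that carry a `name` attribute) is my assumption about the report layout, and the sample data is invented, so check the element names against the file the plugin actually generates.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Invented sample in the ASSUMED shape of the dependency-license XML report.
SAMPLE_XML = """
<dependencies>
  <dependency name="org.example:alpha:1.0">
    <file>alpha-1.0.jar</file>
    <license name="Apache License, Version 2.0" url="http://www.apache.org/licenses/LICENSE-2.0" />
  </dependency>
  <dependency name="org.example:beta:2.1">
    <file>beta-2.1.jar</file>
    <license name="Example Copyleft License" url="http://example.org/copyleft" />
  </dependency>
  <dependency name="org.example:gamma:0.3">
    <file>gamma-0.3.jar</file>
    <license name="Apache License, Version 2.0" url="http://www.apache.org/licenses/LICENSE-2.0" />
  </dependency>
  <dependency name="org.example:delta:1.1">
    <file>delta-1.1.jar</file>
  </dependency>
</dependencies>
"""


def licence_mix(xml_text):
    """Count how many dependencies fall under each licence name (Q1)."""
    root = ET.fromstring(xml_text.strip())
    mix = Counter()
    for dependency in root.iter("dependency"):
        names = [lic.get("name", "unknown") for lic in dependency.iter("license")]
        # A dependency without any licence element is counted as "unknown".
        for name in names or ["unknown"]:
            mix[name] += 1
    return mix


if __name__ == "__main__":
    for name, count in licence_mix(SAMPLE_XML).most_common():
        print(count, name)
```

Anything counted as "unknown" is a candidate for a manual check of the component's POM or project site.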

The dependency.check reports were interesting and I’d encourage you to generate them for yourself and analyse them with the associated guidance. It’s not perfect but it’s a start so I think Q3 is solvable to a limited extent.

What becomes quite clear is that the licence aspect is reasonably easy provided you can locate the licence and this is somewhat of a one-off. The dependency check is more complex as the reporting of these issues and the associated matching of the issue to the component version is not an ingrained process for many projects. You’re likely to need a range of inquiries to help in analysis:
  • Review of the project vitality and its issue tracker
  • Tracking new releases and changelogs
    • An automated report should be able to tell you if you’re out-of-date
  • In-house security testing (where feasible)
  • Monitoring mailing lists
    • I consider 100% coverage on this almost impossible - maybe just track key dependencies
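
The out-of-date report mentioned in the list above could begin as something like this sketch. The component names and version numbers are made up, where the "latest" versions come from is left open, and the comparison only looks at the leading dotted numbers, so qualifiers such as release candidates or snapshots are ignored.

```python
import re


def version_key(version):
    """Turn '1.10' into (1, 10) so that 1.10 sorts after 1.9.

    Only the leading dotted numeric part is used; '1.0-rc1' becomes (1, 0).
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if not match:
        return ()
    return tuple(int(part) for part in match.group(0).split("."))


def out_of_date(in_use, latest):
    """Return {component: (current, newest)} where a newer version is known."""
    report = {}
    for component, current in in_use.items():
        newest = latest.get(component)
        if newest is not None and version_key(newest) > version_key(current):
            report[component] = (current, newest)
    return report


if __name__ == "__main__":
    # Hypothetical data: versions in the build versus the latest known releases.
    in_use = {"org.example:alpha": "1.9", "org.example:beta": "2.1"}
    latest = {"org.example:alpha": "1.10", "org.example:beta": "2.1"}
    for component, (current, newest) in sorted(out_of_date(in_use, latest).items()):
        print(component, current, "->", newest)
```

Comparing versions as tuples of integers avoids the string-comparison trap where "1.9" looks newer than "1.10".
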
I’m still intrigued and have started a project in GitHub to look at an analysis tool. I’ll have to see where I can steal a bit of time…

Saturday, April 4, 2015

Dead letter offices

Let’s consider printing a map that shows us how to get to a meeting in the city. Once we get to the city and the meeting starts, what’s the utility of that map in terms of the meeting? It may be useful to get me back to my office but you probably don’t want it in the minutes or hand out a copy to everyone.

Okay, it’s not a great metaphor (you’d have used your phone) but you probably get what I’m saying. We often prepare items to achieve “something” but, once achieved, that artefact has no real utility as it has been replaced - probably by code. In an agile approach we should aim to prepare enough documentation (text and diagrams) to help build shared understanding and for us to competently undertake the work requested. Once that feature/story/use case resides in code, it is the code that has become the living rendition of that understanding. Over time, operational fixes and other updates will see the code move further away from the project’s documentation and those words and diagrams start to look more like zombies than the living reality of the code.

I’ve been approaching this issue by linking wiki documents such as use cases to tickets in our job-tracking system. As we develop we can close off the ticket via a commit comment and the status of the job is reflected against the documentation. This lets us see where and when the documentation and code aligned. Storage is cheap so I don’t actually throw the document away - I just let the project continue its flow and the commit message, ticket and document mark a spot in the stream. If that document->ticket->code link didn’t exist then we’d be tempted to think that the documentation and the system are a perfect match, regardless of the passage of time and decisions.

So I’m not advocating that you just toss out those use cases or story maps but know what they represent. Unless you work in an organisation which really does change management to the extreme, the role of all those diagrams and all that text is to help the project team prepare the code that forms the system. Do as little as is useful and no more.

Monday, March 30, 2015

Use cases and user stories

It often seems that the biggest challenge in a software project is determining what the user wants and how that translates into a functional system. Within agile approaches we seek to partner with the users and take the journey with them rather than interrogate them before we code and release to users only when the system is fully operational. A range of tools and processes have been developed over the decades to help understand what the user wants and, in this post, I want to look at two such approaches: user stories and use cases. I’ll start with an overview of each and then discuss the possibility of using them together.

User stories

Stories are based in the XP approach and are simple 1–2 sentence work item statements. Something like “List all currently active users”. Sometimes they use a template such as “As a <role> I want to <reach some goal> so that I can <get some benefit>”. The mistake often made is to believe that this is it (done! complete!): we know what they want so LET’S CODE! Actually, stories are a placeholder for a conversation between people. These people need to discuss the story to such a level that it can be turned into code and tests. In some teams you may see this as a straight-to-test/code approach in which the conversation between the developers and the expert user is turned into code immediately and little documentation is done (aside from a bit of whiteboarding). Other teams expect that the users/analysts are busy describing the story away from the development team and, when development on the story is about to start, these descriptions appear and can be acted upon. How these descriptions actually appear is sometimes not really clear. The goal is really to get the user’s needs into code with as few intermediary steps as possible (ideally none).

If you really just use one-liner user stories as backlog items then you’re probably facing problems around not knowing the full story and making assumptions. This is especially the case if you don’t have an expert user available to the development team as a full-time knowledgebase. It’d be worth your time checking out this guide by Mike Cohn.

Use-cases

Use cases predate the Agile movement (BM - before manifesto). Ivar Jacobson is credited as the father of the use case but he is by no means the only person that worked on how use cases are prepared. I like Alistair Cockburn’s “Writing Effective Use Cases” as it’s a very readable book with solid examples and presents a number of approaches to writing use cases. Use cases seek to capture behavioural requirements for a system - essentially a description of how stakeholders and a system will interact (at the conceptual-, not interface-level). Some authors use text-centric approaches, others use diagram-centric approaches but, at a conceptual level, use cases can be thought of as a user goal combined with the set of “things” worth knowing about the interaction. In particular, we need to know who is involved (the primary actor), what the success scenario looks like and any exceptions that may occur to this scenario (good and bad). A use case may be described broadly in the early days of the project and the details expanded just in time.

Unfortunately, some people have boiled use cases down to that stick figure and ellipse diagram. Much like user stories being boiled down to 1–2 sentences and no conversations, it is an anaemia created by misuse or misunderstanding and not the fault of the approach itself. You’ll also hear arguments that use cases “are not agile” - an odd term often used by charlatans. They argue that use cases gather lots of requirements up front and take us back to “waterfall” approaches. But that’s an argument against writing complete use cases up front and not against writing use cases.

Will it blend?

A brief search of the internet regarding the combination of use cases and user stories will yield articles such as:

This is confusing. I understand that you shouldn’t mix bleach and ammonia as it’ll really start to cause physical pain. But why can’t we mix two methods? Do we create a singularity? No. What we should do as effective practitioners is read a variety of opinions, evaluate them and, if the positives are stacking up, we should try them out. Perhaps we’ll do this in a small project or just for a couple of sprints. Once we’ve had a run with it we’ll use our retrospective time to evaluate how it went, maybe add our observations and recommendations in our learning log and decide if it’s worth pursuing. This is us being empirical.

What we must always remember is that these structures/methods/processes aren’t a universal law. Even the Agile manifesto really boils down to some guys meeting up and negotiating a shared set of values. Granted, these are experienced software guys and the outcome distilled critical learning from their careers so it’s worth considering their ideas. Alistair Cockburn’s Oath of non-allegiance reminds us that we should consider options from a variety of sources and the manifesto’s priority of “Individuals and interactions” really guides us to respond to the people around us rather than just plough on ahead with a methodology or tool that isn’t optimal. You should be able to adapt to your surroundings and empirical evidence is your shield.

Mix it up

In Education circles, Constructivist theory provides a notion called the “zone of proximal development” (ZPD). Described by social constructivists (those that view development as influenced by those around us rather than just internal), the ZPD indicates what a learner can do with the aid of an instructor but not by themselves. When guiding a learner we establish scaffolds to guide them through new learning. This sort of thing should happen in effective teams and organisations - senior people help more junior colleagues in advancing by mentoring them at key points in their development (in a wise organisation we can sometimes see the junior bring new ideas and questions that prompt those more senior to learn). A new team member may not know much about testing so we could send them on a course, have them pair with an experienced tester, set them small challenges and review their solutions to help guide them to more effective ones. By just throwing new work at the team member with no attention to their skills, the gap to the ZPD may be too great and they will quickly become demotivated.

So what’s this got to do with my discussion about user stories and use cases? I’m glad you asked! Say your team is using user stories but you keep finding that the user feedback is that the system isn’t really meeting their expectation of a certain business process. Alternatively, consider that you have noticed that the team isn’t handling exception conditions very well. On reflection you may determine that some more structure is needed when discussing a story. Perhaps the team or the client isn’t delving into the story enough or perhaps key additional stories aren’t surfacing.

This presents a learning opportunity. By bringing in more formal tools you can help scaffold the thinking process around user stories so that the correct information is being collected and responded to. You may choose to bring in use case templates to this end. As with all scaffolding, once the building is done you can reduce its resolution and, eventually, remove it. The hope is that team members and users have absorbed the learning and now perform the action as a matter of course. At certain times, such as complex stories, they can put up a bit of scaffolding again to help them through the problem.

In cases where the client/users are looking at 300 user stories and are struggling to work out how they interact you might look at the use case concept of goals as a method of drawing out the top-level value items. You may even flip the whole thing and start with the coarse use cases that generate stories. These help guide everyone in learning where things provide value and we often change tack in the ZPD if the learner doesn’t appear to be picking up what they need - we don’t just keep pointing at the same thing and start talking louder in the hope they work it out.

People may argue that 1 use case = many user stories and that a use case may be too big for one sprint. There may be a debate about a fear of increasing levels of documentation. Have these debates. These are reasonable debates (when informed and not dogmatic) and the debate is good but outcomes are better. What matters here is that you are using the tools that best help you to communicate with stakeholders and to deliver working software you can be proud of.

Thursday, March 26, 2015

Standing up against stand ups

In the first post following on from That’s not agile I want to look at a practice I find to be one of the most tricky and contentious issues I encounter when discussing Agile, especially with developers: the daily stand up. Many Agile people tell me that these daily sessions are a MUST and you’re not Agile if you’re not doing them. However, these sessions aren’t described in all agile methodologies and, in many cases, you can watch these meetings as if you were an anthropologist and see where the project is not actually agile at all.

Many agile teams use stand ups (or daily scrums) to ensure that the team is working through the backlog/tasks/stories effectively. Take a look at the list below and see if any/all of these match what you see/experience:
  1. The stand up goes for longer than 15–20 minutes
  2. The project manager/scrum master/loudest person does all the talking
  3. Most people are sitting down
  4. People have their laptops, tablets and mobiles out and are reading off them
  5. It’s rare that the whole team is there at every stand up
  6. Someone is pointing at a Gantt chart

Most stand ups I’ve been involved in usually hit 3–4 of those items. In most situations I can tell that the team needs (is crying out for) better communication, so Item 1 is the most common. Of course, if the project team is communicating fluently throughout the day, these daily meetings may not be required at all. In fact, the somewhat artificial nature of the approach can lead to breaks in thought and communication that’s already going on within the team.

Some project managers have indicated to me that it’s a useful approach if the team is distributed and/or some members are working across projects. In both of these situations I wonder if the structure is actually debilitating and that, instead of stand ups, the project would be better served by bringing the team into the same room and ensuring team members are focussed on one project at a time. The context switching hits in numerous ways - it’s an attention break and sometimes a complete change in direction (especially for the project part-timer).

Another reason often cited is that developers don’t communicate very well. First up, it’s important to check that assertion before reacting to it as many developers will sit right next to each other and instant message throughout the day. If your criterion for communication is talking out loud perhaps you need to dig a bit deeper. Crystal Clear stresses the utility of osmotic communication and many of us will have seen projects that just hum - people talking, sharing, laughing. Team members are actively problem-solving with each other, grabbing a coffee and discussing a piece of the architecture or drawing all over the whiteboard. Those teams using XP are pair programming and those that aren’t XP’ers may pair program ad-hoc when a tricky problem has come up. Throughout the day there are waves of quiet as people focus and louder times when discussions need to happen.

My big question here is: when the team is humming, do you really need a stand-up?

If the team is in the same location and someone completes a story, couldn’t they just call out “I’m done with Story X and reckon I’ll pick up Story Y”? If they have a problem, isn’t calling out “Story W is killing me - the message queue looks like it’s not <blah> - anyone able to help me for a bit?” effective? Do they really need a stand up? Doesn’t the PM/scrum master/lead hear this because they’re in the same room anyway?

I believe the answer is that you don’t need these daily meetings but… in the early stage of a project or with a new team member it might be worth using stand ups to try and kick-start the communication. As always, monitor, reflect and respond to the situation - don’t just do something because you’ve been told “that’s agile”.

Wednesday, March 25, 2015

That's not agile

I often hear “I do agile” and “That’s not agile” and I’m starting to put it into the same bucket as “That’s not an enterprise approach”. Such terms are often used to reassure those that have just heard that “Agile is good” or as a pre-emptive strike against those asking where the documentation is. I’m thinking of a few posts around items that are often labelled as “Not Agile” but perhaps really can be. First, though, I want to dig into the term “Agile” a little.

The approaches that were swirling around before Agile was something in which you could be certified included ideas such as rapid prototyping, iterative development, use cases and adaptive processes. It seemed that people were tiring of bloated projects that marched on like budget-destroying zombies. Central to many concerns appears to have been the issue of gathering requirements/needs and turning them into software. Do it as two very large blocks and the requirements phase isn’t based in the reality of implementation. Expecting that the users/stakeholders can describe all requirements/needs/goals completely and comprehensively up-front was to ignore the complexity of what is being built and our ability to grasp the context as a whole.

But let’s go back a little. In August 1970, Dr Winston W Royce started his paper in Proceedings, IEEE WESCON with:

I am going to describe my personal views about managing large software developments. I have had various assignments during the past nine years, mostly concerned with the development of software packages for spacecraft mission planning, commanding and post-flight analysis. In these assignments I have experienced different degrees of success with respect to arriving at an operational state, on-time, and within costs. I have become prejudiced by my experiences and I am going to relate some of these prejudices in this presentation.

Royce goes on to describe what many refer to as “the waterfall method” of:

  1. System requirements
  2. Software requirements
  3. Analysis
  4. Program design
  5. Coding
  6. Testing
  7. Operations

Royce doesn’t call this approach “waterfall” in the paper and he points out that it “is risky and invites failure”. He also states that it is “important to involve the customer in a formal way so that he has committed himself at earlier points before final delivery. To give the contractor free rein between requirement definition and operation is inviting trouble.” It is here that the past tells us that we are building software for people and the best input into building software is people. People who can describe what they need and tell us how well we’re going by trying out our work - a collaborative feedback loop.

In The Sciences of the Artificial, Herbert Simon posits:

An artifact can be thought of as a meeting point - an “interface” in today’s terms - between an “inner” environment, the substance and organization of the artifact itself, and an “outer” environment, the surroundings in which it operates. If the inner environment is appropriate to the outer environment, or vice versa, the artifact will serve its intended purpose.

It’s a great book and really gets into the act of design in the construction of artificial artifacts (such as software). Simon’s use of terms such as “satisficing”, “problem space” and “search strategies” gives a vocabulary to the idea of a project as a journey. It is upon considering software development as an act of design and a project as a journey that I see the Agile approach offering more than the linear, segmented viewpoints.

Agile basics

In February 2001 a small group of software people met in Utah to see if they could coalesce their various thoughts and experiences. This wasn’t an ISO committee or a Government research body - it was some 17 guys with a body of experience in software development. Let’s go back to the values they came up with and re-read their manifesto:

  • Individuals and interactions over processes and tools
  • Working software over comprehensive documentation
  • Customer collaboration over contract negotiation
  • Responding to change over following a plan

There are also 12 principles underlying these 4 deceptively simple ideas - go read them. Also, understand that these are the aspects that make up Agile according to those 17 people and, by extension, those that claim they “do Agile”. Now, you might be a Scrum shop, an XP team, Crystal, insane chaos, whatever - it’s important to sit back, reflect and think about how effective the team is in meeting the user’s goals. This is the area in which all four items come together: Is the manner in which we’re working (interacting) with the users (individuals) really ensuring that we all understand what needs to be done (collaboration) to build working software that helps them, and are we responding to new ideas and clarifications from our users? The various agile methodologies provide projects/teams with lampposts and signs down the Agile street but I recommend that all teams check their approach/interpretation against the manifesto to make sure they haven’t processed the user out of sight or put processes and tools first.

In many senses I don’t actually see this as an Agile-specific area of thought. If I can boldly argue: most projects commence with the aim of delivering some benefit to some party that (hopefully) has a greater return than the cost of the project or the cost of business as usual. From that starting point we see projects leave dock and either arrive at the next port or go horribly off course (a small number may return to dock when they discover the sea can’t be crossed). It’s at this point I really turn to Alistair Cockburn’s Oath of non-allegiance:

I promise not to exclude from consideration any idea based on its source, but to consider ideas across schools and heritages in order to find the ones that best suit the current situation.

There is a huge spectrum of people that state that their projects are Agile. Some are almost clerical in following what they perceive as the one true methodology. Others are extremely sloppy in terms of process but their team knows their users really well and they just get s**t done. Several are really just running a prioritised task list and use “Agile” as a method to avoid documenting anything, committing to anything or being responsible for anything. This last group really dilutes the validity of the ideal and, unfortunately, when their project fails it’s often blamed on the agile approach - next time we’ll really need to micro-manage the team.

It’s important that we sometimes sit back and make sure we’re really being Agile and not just process-gazing. Those that claim to be agile should be checked for that claim and helped where they fall short. Some of my projects aren’t agile and I’m honest when I say we use aspects of agile methodology but I don’t claim the approach meets the agile manifesto. I also try to encourage the movement of those projects back to the agile path. However, I’m often a little perplexed when I use something like a use case template to help guide thinking and get told “use cases aren’t agile”. In further posts I want to explore why I think they (and other tools) are and why people have constructed some sort of checklist of what isn’t agile.

Tuesday, March 24, 2015

The treachery of diagrams

“A picture tells a thousand words” is often chanted by people that don’t want to write 5 words, let alone 1,000. So they draw a diagram and this can be a trap. Don’t get me wrong, I draw lots of diagrams over the course of a project - use cases, flow charts, concept maps, formal notation stuff. What matters most isn’t the diagram or the text - it’s ensuring that the right people share the understanding. We’ve all seen huge requirements documents that we know no-one has read or, when development starts, will read.

Whilst I could write up a list of document traps such as the user-dozer[1] or the arbitrary weight/length/template metric I want to look at approaches to diagrams that are used to either flummox non-technical people or to prepare less than the bare minimum documentation needed to deliver a competent output.

For each of the three traps I’ll highlight a praxis point that can help you skirt the trap.

The useless case diagram

A piece of art can convey and draw upon deep concepts such as culture and memory that engage us as a viewer. Technical diagrams are not primarily pieces of art and should exist only to provide a succinct analysis of a design element. The Unified Modelling Language took in Ivar Jacobson’s stick figure and ellipse notation, tool-providers added these to their stencils and many, many people started thinking that these were the entirety of use cases. I stand in the gallery and look at the stick figure labelled “User” pointing at an ellipse labelled “Withdraw cash” and it takes me back to having been in Italy on the Feast of the Assumption and having the ATM eat my card. Actually, it just looks meaningless - like I’m not being told how much work lies beneath.

On its own the ellipse diagram is only part of the story and it should be accompanied by a body of text that helps describe aspects of the use case such as the main success scenario and possible exceptions. Other diagrams suffer from the same curse - they don’t tell enough of the story.

Praxis: Diagrams and text are outcomes of conversations and are renditions of an understanding, not the understanding itself. Make sure that the diagram and text mean enough to stir agreement and action.

The intimidation notation

I enjoy animated series such as The Simpsons and Family Guy. The use of animation lets them deliver extreme situations that a live action TV series probably could not attempt. However, when The Simpsons came out many people seemed to think it was for kids as it’s a cartoon - they maybe hadn’t seen any Anime. In a similar act of misplaced alignment, certain technical diagrams are often rolled out in front of stakeholders - they’re pictures, how hard can it be?

I have seen large, complex diagrams prepared in Business Process Model and Notation (BPMN) put in front of end users as part of seeking “sign off” prior to development (this project was far from agile to begin with and this just took it further away). I have no issue with the BPMN but it’s really a technical syntax that, for the untrained eye, is likely to be no different to showing them assembly code.

By using these documents on stakeholders we’re creating a situation where they may not feel confident enough to say they don’t understand what they’re looking at (very few people like to feel stupid, especially in a group). Actually, if one stakeholder stands up and says “I have no idea what that is and have no ability to give you any further feedback” then you’d probably know who was the stupid person in the room (hint: it’s not the one that stood up). We are working in their domain and we should be either training them up in the technique or tailoring the presentation to meet their model of the world. Did the BPMN actually say more than a text-based table or even a basic flow-chart format or did you use BPMN as that’s what you need to put in the workflow engine?

Praxis: When used for gathering and checking information about user goals, speak in a language that the user understands - diagrams can be just as technical as text and code.

The wall of confound

Working in the 1990s you’re likely to have walked into an office area and, instead of seeing project backlogs and story cards all over the wall, you would have seen a huge map of a relational database. Perhaps these were useful when computer screens were smaller. To me, however, they really yelled “This thing is big, really big, beware all who enter and any that seek modifications” and “We printed this so now it’s forever”. Most people in the office were also suffering from ozone poisoning.

Once your diagrams get so big and complex that you need to hire a sherpa as you walk through it, you need a rethink about how to present it (maybe also rethink your architecture). Approaches such as the C4 Model are a good example of an almost Google Earth approach in which we can look from a high level and zoom down into our suburb by selecting layers relevant to the resolution we need. Even better, approaches such as C4 don’t demand that everything be perfectly diagrammed and we might only draw lower-level diagrams for complex components that are currently under analysis.

Praxis: Diagrams and text that are too large and complex just don’t help the discussion. Choose methods for zooming-in to reasonable scale whilst tuning out parts of the system not important to the discussion.

That is not my dog

Sorry, that’s just a reference to one of my favourite comedy sketches. Actually I wanted to close this post by pretending I know about art. René Magritte’s “The Treachery of Images” is a favourite of mine as it beautifully reminds us that the image (of a pipe) is not the actual item (a real pipe):

René Magritte’s The Treachery of Images - This is not a pipe

When we draw diagrams and write text we must remember that it is the running system that is the pipe.

Reading

Some useful texts worth your time:

  • A. Cockburn, Writing Effective Use Cases, 1st edition. Boston: Addison-Wesley Professional, 2000.
  • Ivar Jacobson, Ian Spence, and Kurt Bittner, USE-CASE 2.0 The Definitive Guide. Ivar Jacobson International SA, 2011.

  1. These are long and/or technical documents aimed at getting the user to agree just so you go away. They flatten the user’s interest completely.  ↩

Friday, January 23, 2015

Artefact Repositories

Working from the same set of components helps to catch problems early

In my last posts I provided an overview of version control and the use of a continuous integration server. This post adds an artefact repository to the development infrastructure. The artefact repository is a versioning store for compiled code components (such as jar files) and helps teams using and managing both 3rd-party and in-house developed components.

Most of my recent experience has been in the Java world and in using Apache Maven to manage builds, dependencies and various other tasks. I came across Maven several years ago and its dependency management feature quickly sold me. At the time I’d joined a team who had development documentation that required I locate a range of library and binary files scattered across the web. As I started to establish my development environment it became apparent that a number of the required components were no longer hosted at their original location and I had to ask team members to send me their copies. This is maddening enough when the team is in the same location but it completely collapses if you are remote, especially if you’re working in an open source community.

At first I wrestled with the idea of storing the files in our Subversion repository but it occurred to me that we’d still need some sort of system/script for bringing these files together and making sure the correct versions were being used. Likewise for setting up an FTP/HTTP accessible file store. After a fair bit of research I decided that Maven really gave me what I was after: the ability to manage in-house and 3rd-party components used by the build.

I’ve enjoyed using Maven but the XML configuration can get tedious and lately I’ve enjoyed working with Gradle. Dependency management in Gradle includes both Maven and Ivy repositories and Gradle has (improving) support for publishing to these repositories.

Once you discover the benefits of an artefact repository in its “pull” model (where you’re primarily grabbing components out of it) you can then really extend the advantages by setting up a local repository for managing your own components. Beyond that you can start distributing your components to central artefact repositories so that they can be easily utilised by other developers.

Apache Maven is a build configuration tool and not an artefact repository. By default, Maven will use the central repository to source dependencies and will cache them in your home directory (~/.m2/repository/). This article really encourages you to look at the benefits of running your own artefact repository.

What are the benefits?

Here are some of the key reasons to use an artefact repository:

They stop you needing to hunt out library files from the web, network storage or your colleague’s USB drive.
Setup documentation that includes “get a copy of CrazyLib.jar from http://libs.example.com” is just asking for trouble. Instead of this you define a dependency in your build configuration and the build tool downloads a copy from the artefact repository.
Harmonise the libraries and their versions across the team
Reduce the “it works on my laptop” and “d’oh we’ve been working on different versions for the past 3 months” moments by defining the build configuration used across the team.
Real code re-use
To my mind the model of an artefact repository makes code reuse much more achievable than other approaches I’ve encountered. Structuring functionality into components helps improve system architecture and, when done right, means that a project can publish re-usable components based on specific functionality.
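To make “define a dependency and let the build tool fetch it” a little more concrete, here is a minimal Python sketch of how a Maven-style repository turns a dependency’s coordinates into a file location. The directory layout follows the standard Maven repository convention; the coordinates `com.example:crazylib:1.2.0` and the repository URL are invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coordinates:
    """Maven-style coordinates that identify one versioned component."""
    group_id: str
    artifact_id: str
    version: str
    packaging: str = "jar"


def repository_path(dep: Coordinates) -> str:
    """Return the artefact's path relative to the repository root."""
    # Dots in the group ID become directory separators.
    group_dirs = dep.group_id.replace(".", "/")
    file_name = f"{dep.artifact_id}-{dep.version}.{dep.packaging}"
    return f"{group_dirs}/{dep.artifact_id}/{dep.version}/{file_name}"


def download_url(repo_base_url: str, dep: Coordinates) -> str:
    """Join a repository base URL (hypothetical here) with the artefact path."""
    return repo_base_url.rstrip("/") + "/" + repository_path(dep)


# Hypothetical dependency and in-house repository, for illustration only.
crazylib = Coordinates("com.example", "crazylib", "1.2.0")
print(download_url("http://repo.example.com/releases", crazylib))
```

Because the location is derived entirely from the coordinates, everyone on the team who declares the same dependency resolves the same file, which is where the harmonisation benefit comes from.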

The role of the artefact repository

Development infrastructure example
Proxy for other repositories
The Central Repository provides a huge array of components to the Maven community. By setting up a local proxy for this repository you can cut down on network traffic and help speed up builds.
Local store of 3rd-party modules
When you start using a 3rd party module (perhaps a jar or war) that isn’t available in another repository you can upload it to your local repo and make it available to the whole team.
Working with your continuous integration (CI) environment
The CI environment should be charged with publishing SNAPSHOTs to your repository as a post-build step. This means that other (sub-)teams will pull down the newest version of the component if they happen to be using it.
I’ve also found it extremely useful to configure a CI build that performs a clean-room build on a daily basis. This ensures that the CI system doesn’t use its cache of components and downloads everything configured as a dependency. This can be handy in discovering if “lingering” components are generating false positives in the build.
Release hosting
When ready to release you switch from releasing SNAPSHOT (development) components to the production (release) version. Artefact repositories can (and usually do) host both and this demarcates stable and non-stable components.
Repositories such as Sonatype Nexus can also provide interfaces used by package managers such as yum. This allows you to distribute your packages into the same repository as your code components.
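The SNAPSHOT/release split described above can be sketched roughly as follows. Maven marks in-development versions with a `-SNAPSHOT` suffix; the repository names `snapshots` and `releases` used here are assumptions for the example and your repository manager may name them differently.

```python
SNAPSHOT_SUFFIX = "-SNAPSHOT"  # Maven's marker for in-development versions


def is_snapshot(version: str) -> bool:
    """True when the version is a development (non-stable) build."""
    return version.endswith(SNAPSHOT_SUFFIX)


def target_repository(version: str) -> str:
    """Pick the hosted repository to publish to (names are hypothetical)."""
    return "snapshots" if is_snapshot(version) else "releases"


def release_version(version: str) -> str:
    """Drop the snapshot marker when cutting a release."""
    if is_snapshot(version):
        return version[: -len(SNAPSHOT_SUFFIX)]
    return version


# A CI build of 1.4.0-SNAPSHOT goes to the snapshot store; the eventual
# release is published as plain 1.4.0 in the release store.
print(target_repository("1.4.0-SNAPSHOT"), release_version("1.4.0-SNAPSHOT"))
```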

Backing up your repository is really important as it will (over time) start to contain components that you may not be able to recover/rebuild in the case of a system failure.

Products

My primary experience has been with Sonatype’s Nexus product - the OSS version specifically. There are a few other repositories worth investigating:

A handy comparison is also available

More than just code modules

At the core of a repository such as Nexus is a filesystem that tracks versions. This gives you a piece of infrastructure that can handle the various components and packages that aren’t well-suited to version control systems such as Git. Here are a few ways in which I’ve utilised an artefact repository:

Host virtual machine images
For this work I created a build environment in Rake that used Packer to generate virtual machine images and store them in Nexus. Team members and the build environment could then use a Vagrant file to grab a copy of the image - ensuring that we had a stable platform for testing and creating demo systems. It gives you “platform defined as code”.
Deployment file store
Using tools such as Puppet has become increasingly prevalent but I notice many people resort to using version control for storing large files that are needed in deployment. Ideally you might look at setting up a local package repository but converting an existing application distribution to RPM can be time-consuming. Artefact repositories provide a system for hosting versioned files and an HTTP-based access point.
Production build proxy
I was recently involved in a project that utilised Python and I noticed that the “release” onto the production server involved the deployment tool dragging a heap of libraries from the Python Package Index (PyPI). I may be a bit old-school but the idea of a production system grabbing libraries from the public Internet makes me rather queasy. My suggestion was to set up a local proxy (such as devpi) to at least divert the egress into something a little more “controllable”. A number of artefact repositories provide services for a variety of package managers - providing a central repo for many uses.
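As a small sketch of the production build proxy idea, a deployment script could build its pip command so that packages are resolved from an internal index rather than the public one. The `--index-url` and `--requirement` options are standard pip flags; the index URL and the helper name below are hypothetical, and the function only constructs the command rather than running it.

```python
import sys
from typing import List


def pip_install_command(index_url: str, requirements_file: str) -> List[str]:
    """Build a pip command that resolves packages from the given index only."""
    return [
        sys.executable, "-m", "pip", "install",
        "--index-url", index_url,            # replaces the default public index
        "--requirement", requirements_file,  # the pinned dependency list
    ]


# Hypothetical internal proxy address, for illustration only.
command = pip_install_command(
    "https://pypi.internal.example.com/simple/", "requirements.txt"
)
print(" ".join(command))
```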

Summary

Over these past 3 posts I’ve covered version control, continuous integration and now artefact repositories. I hope these introductions describe the utility of each piece of development infrastructure.

Thursday, January 15, 2015

Continuous Integration

“It works on my machine” really translates to “I don’t know why it works - I just clicked buttons” and the offender should be forced to buy lunch for the team
Continuous Integration (CI) focusses on ensuring that a project’s code successfully builds whenever the code base (usually stored in a version control system) changes. The “continuous” aspect relates to the fact that the build is run every time code is checked in (committed). Given that some approaches see each team member commit code several times a day, the CI system may be quite busy. Having worked in a team where one member kept on committing code that broke the build, using a CI approach helped us determine where problems were coming from and saved time misspent thinking that your own code is wrong (svn blame[1]).
In fact you could have a CI process that rolls back a commit that breaks the build.
The CI approach differs somewhat from approaches such as “nightly builds” as it really is continuous and can really help make sure people are only committing code that doesn’t break the build[2]. This constant feedback loop should trigger a fix immediately whilst the code is “top of mind”. An associated work practice is “pull often, commit small, commit often” so that the team is working in-step and the CI process helps capture issues before they get too big[3].
This may sound a little complex but a CI process really has only 3 responsibilities:
  1. Trigger a build when new code is checked into version control[4]
  2. Run the build and its associated unit tests
  3. Report on any failures
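Those three responsibilities can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `on_commit` stands in for whatever your version control hook calls, and the build steps are placeholder commands rather than a real build.

```python
import subprocess
import sys
from typing import Sequence


def run_step(command: Sequence[str]) -> bool:
    """Run one build step and report whether it exited cleanly."""
    return subprocess.run(command).returncode == 0


def on_commit(commit_id: str, steps: Sequence[Sequence[str]]) -> str:
    """Triggered by a new commit: run each step, stop at the first failure."""
    for step in steps:
        if not run_step(step):
            return f"FAILED {commit_id}: {' '.join(step)}"
    return f"OK {commit_id}"


# Placeholder steps standing in for "compile" and "run unit tests".
compile_step = [sys.executable, "-c", "pass"]
failing_tests = [sys.executable, "-c", "raise SystemExit(1)"]

print(on_commit("abc123", [compile_step]))
print(on_commit("abc124", [compile_step, failing_tests]))
```

A real CI server adds scheduling, history and notifications around this, but the trigger, build and report loop is the core.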
Your CI process doesn’t have to be a huge bells and whistles affair. In the most basic case it may be a desktop PC that developers walk over to and manually start a build. The use of a separate system for CI helps reduce (but not eliminate) the false positives that occur when a build succeeds on a developer’s machine because of the miscellaneous debris crawling around developer laptops (old versions, libraries on classpaths that aren’t included in the build config etc).
There are many CI systems around that help you get going with a more automated CI environment:
If you’re after an online service, take a look at Travis - I’ve not used it but it gets good press.

Automating your build

Automating your build is extremely useful in terms of successfully establishing your CI environment but, beyond this, it’s a good candidate for the “best practice” list. In a CI environment an (efficient) automated build is extremely important as the build should be possible without manual intervention. Long build times may indicate that the build needs to be broken up into smaller components or that your tests are verging away from unit tests towards integration testing.

Workflows

In a coding approach where you use branching workflows you might have the CI system “watch” only certain branches. Furthermore, it could be useful to consider a Read Only Master Branch in which individuals/features/ideas/etc have their own branch but a merge to the master branch is tested before being accepted.
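A rough sketch of “watching” only certain branches: the CI trigger checks the branch name against a list of patterns before starting a build. The patterns and the function name are assumptions for the example; CI systems usually offer this as configuration instead of code.

```python
from fnmatch import fnmatch
from typing import Iterable

# Hypothetical set of branches the CI system should build.
WATCHED_BRANCHES = ("master", "release/*")


def should_build(branch: str, watched: Iterable[str] = WATCHED_BRANCHES) -> bool:
    """True when the branch name matches any watched pattern."""
    return any(fnmatch(branch, pattern) for pattern in watched)


for name in ("master", "release/1.2", "feature/login"):
    print(name, should_build(name))
```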

Component-based development

A CI server can be very useful in projects that have teams working on separate components. For example, Apache Maven-based projects can have their CI server deploy SNAPSHOT artefacts to an artefact server (such as Sonatype Nexus). This means that other developers with that dependency will have the new component version downloaded the next time they run a build.

Steak knives

As you develop your CI infrastructure you can start exploring a number of toolsets that can further aid the development effort:
  • Software metric tools such as SonarQube can help you home in on areas of weakness in your code quality (e.g. duplication, dodgy coding practice or a lack of documentation)
    • These are best run less frequently (nightly) as they can be time consuming
  • Generate your documentation such as your javadoc or Maven site on a nightly basis
  • Create a “clean room” build that freshly downloads all dependencies before building - this really helps catch issues such as 3rd-party libraries that just disappear.
  • Deploy an instance of the built service into a virtual machine for user testing, interface testing (e.g. with Selenium) or integration tests[5]
    • Also best run less frequently - especially if you’re going to be soaking up a fair bit of system resources.
  • Get Chuck Norris in to make sure people know you’re serious!

Test drive

Most CI systems are quite easy to install and get running - even just on your laptop. I’d suggest that the best first-step is to allocate 3–4 hours to install a CI system (try Jenkins), configure a job for your main codebase, run it and see how it goes. Then, add in Chuck Norris.

  1. This was used as an in-joke - see the SVN Book - but, seriously, the CI server is not a torture device that lets everyone insult a team member.  ↩
  2. In a regular environment I consider the build to be broken when it won’t compile or a unit test fails. Usually a less-frequent build would run other tests (integration, UI etc) but I usually label failures differently (e.g. “broke the deployment”)  ↩
  3. It can be useful to manually trigger the more comprehensive build and test suite once a feature is complete - why wait until tomorrow to see if it’s broken?  ↩
  4. Look at GitHub and BitBucket (Web)hooks for the push-model approach.  ↩
  5. Most of my integration testing experience has been manual or script-based. However, projects such as Citrus Framework look to provide a good basis for easily established integration tests.  ↩