Posted at 11:32 AM in Automation, Cloud, Deployment, Operations, Screencast, Tools, Tutorial, Video | Permalink | Comments (0) | TrackBack (0)
Netflix has written a lot about how it uses Amazon Web Services to operate its infrastructure. I've found its development and use of the Chaos Monkey (and its proposed vision of a "Simian Army") particularly interesting. The basic premise is that all systems fail eventually, so the Chaos Monkey is an automated tool that intentionally disrupts the infrastructure on a regular basis by terminating instances and, in general, wreaking havoc. Their philosophy is that you should always treat your environments as "disposable" because they will fail...eventually. In practice, this is a new mindset, but it shouldn't be. It highlights the difference between a Traditional Operations mindset and a Cloud Operations mindset.
We architect and operate Continuous Delivery systems in the Cloud and help companies migrate their operational infrastructure to the Cloud. We've found that what prevents teams from getting the most benefit from the cloud is a traditional operations mindset.
The traditional operations mindset posits that hardware environments are not ephemeral and are something to nurture and maintain for weeks, months or even years. An informed cloud mindset assumes that anything and everything will fail - eventually - and therefore treats all environments as "disposable". I emphasize an informed Cloud mindset because many bring the traditional Operations mindset with them when moving to the Cloud.
Since we do a lot of work with AWS environments in the cloud, we've noticed some interesting antipatterns when working with traditional Development and Operations teams who aren't used to working in the cloud mindset. I've listed some of these antipatterns below.
Environment Lease Time Policies
The traditional operations mindset believes that environment lease times are perpetual. You can spot this on a project that uses the cloud when development and QA lease times are continually extended. The cloud mindset treats all environments as ephemeral. Reasonable lease times on a cloud project could be as long as 14 days or as short as a couple of hours. There are obviously steady-state runtime environments in the cloud, but even these should be capable of moving to other instances at a moment's notice. In AWS, tools like Elastic Load Balancing, CloudWatch and Auto Scaling support failover architectures such as this.
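To make the lease idea concrete, here is a minimal Python sketch that flags environments whose lease has run out. The `Environment` record, the `expired_environments` helper and the sample data are all hypothetical names I made up for illustration; the 14-day default simply mirrors the upper bound mentioned above, and nothing here calls an AWS API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed default lease, mirroring the 14-day upper bound mentioned in the text.
DEFAULT_LEASE = timedelta(days=14)


@dataclass
class Environment:
    """Hypothetical record describing one leased environment."""
    name: str
    launched_at: datetime
    lease: timedelta = DEFAULT_LEASE

    def expires_at(self) -> datetime:
        # The lease clock starts when the environment is launched.
        return self.launched_at + self.lease


def expired_environments(envs, now):
    """Return the names of environments whose lease has run out at `now`."""
    return [env.name for env in envs if now >= env.expires_at()]


if __name__ == "__main__":
    now = datetime(2011, 8, 1, 12, 0)
    envs = [
        Environment("dev", datetime(2011, 7, 10)),                      # 14-day lease
        Environment("qa", datetime(2011, 8, 1, 9, 0), timedelta(hours=2)),
        Environment("staging", datetime(2011, 7, 25)),                  # still leased
    ]
    print(expired_environments(envs, now))
```

A scheduled job could feed the returned names to whatever teardown script the team already uses, so that extending a lease becomes a deliberate act instead of the default.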
Centralized Control
The traditional operations mindset is built around control. This is because, traditionally, it's the Ops team that's responsible for ensuring the applications are always up and running. Bottom line: their ass is on the line. This means that whenever you request a resource from an Ops team, such as a virtual environment or a database, the request is put into a queue in which you must wait your turn based on the priority and request load of the Ops team. In a cloud operations mindset, control over requesting resources can be more decentralized. This is coupled with fully versioned assets. Traditional operations teams typically fear decentralized control because configuration assets are not managed or versioned. When these assets are managed and versioned, it's much easier to allow anyone to request any resource - particularly in non-production environments - because resources can be easily re-provisioned or reconfigured at any point. In cloud operations, resource requests can be asynchronous through the use of fully automated configuration of environments and other resources.
Lack of Configuration Management
In a traditional operations mindset, configuration is typically hidden on someone's machine, embedded within a tool managed by the Ops team or simply in the heads of one or a few people on the Ops team. The reason is that the Ops team must control and secure the information - particularly in Staging and Production environments. However, the problem with this approach is that the information is locked away in a few people's heads, which creates a significant process bottleneck that slows down the entire software delivery process.
In a cloud operations mindset, all configuration is managed in a database or configuration files accessible to any tool that interfaces with it on the software team. This doesn't mean that everyone has access to all configuration values in all environments (such as, say, Production), but it does mean that any team member can perform a self-service deployment without going through a separate Operations team.
Golden Images
Golden images are particularly insidious because it can seem like you're doing the right thing, but you're not. A golden image is a snapshot of an instance/environment at a particular point in time. Having a golden image is better than having nothing at all, and some teams even snapshot their images regularly, which is a good practice. However, the installation and configuration it took to create the image are lost. When employing the golden image antipattern, there's no way that anyone can recreate the environments in the exact same manner every single time. Moreover, the steps are either locked in team members' heads or captured statically at a particular point in time through documentation. Having documentation for manually configuring the environment is definitely better than no documentation, but compared with automation it significantly reduces the reliability and repeatability of environments. The cloud operations mindset says that all of the steps in creating environments are scripted and versioned in a version-control system. Any engineer on the team should be capable of recreating these environments by typing a single command, clicking a button or running it headless through a Continuous Integration tool.
This touches on only a few of the antipatterns that occur when applying a traditional Operations mindset to the Cloud. Teams won't realize the myriad benefits when moving to the Cloud until they change their mindset.
Start thinking like the Chaos Monkey and employing a Cloud Operations mindset!
Posted at 10:01 PM in Automation, Cloud, Continuous Integration, Operations | Permalink | Comments (0) | TrackBack (0)
I usually shy away from giving a list of tools that we use because people have their particular tool preferences and are sometimes indignant about considering others. However, I realize it's helpful for people to understand the tool landscape when it comes to Continuous Delivery in the Cloud just so they know where to start looking. This is often the most common question I get from readers of my Continuous Integration book.
I want to say up front that I'm not advocating the use of any of these tools, just that we've used or investigated some of them when creating Continuous Delivery systems. I'm sure some of the tools that we use on a daily basis won't make it to this list.
The precise toolset a team may choose to use depends upon numerous factors including project, cost and customer constraints - to name a few. Therefore, I suggest that you focus more on the type of tool and determine which one meets your particular needs for your Continuous Delivery ecosystem. Just because I'm not mentioning a particular tool doesn't mean I'm not using it or that I don't think it's a good tool; these are meant to be illustrative. We tend to focus more on freely-available tools because people can download and use them quickly. There are good reasons to choose commercial tools. As implied before, you don't need to be using all of these tools to get significant benefit from Continuous Delivery. Start small and build it up. I've listed some of the tools in each category for the Java, .NET and Ruby platforms. Since we lean heavily toward Cloud tools, you'll see that we opt for the SaaS-based tools, when applicable. Let me know if your preferred tool didn't make the list. Ok, there's my disclaimer. On with the list:
Application Containers - JBoss, Tomcat, IIS, Mongrel. NOTE: there are so many app containers, I'm not going to try to list all of them.
Build Tools - Ant, AntContrib, NAnt, MSBuild, Buildr, Gant, Gradle, make, Maven, Rake
Code Review - Crucible
Code Insight - Fisheye
Continuous Integration - Bamboo, Jenkins, AntHill Pro, Go, TeamCity, TFS 2010
Cloud IaaS - AWS EC2, AWS S3, Windows Azure
Cloud PaaS - Google App Engine, AWS Elastic Beanstalk, Heroku
Database - Hibernate, MySQL, Liquibase, Oracle, PostgreSQL, SQL Server, SimpleDB, SQL Azure, Ant, MongoDB
Database Change Management - dbdeploy, Liquibase
Data Center Configuration Automation - Capistrano, Cobbler, BMC Bladelogic, CFEngine, IBM Tivoli Provisioning Manager, Puppet, Chef, Bcfg2, AWS CloudFormation, Windows Azure AppFabric. NOTE: There are many names for this tool "category" and a lot of overlap within it.
Dependency Management - Ivy, Archiva, Nexus, Artifactory, Bundler
Deployment Automation - Java Secure Channel, ControlTier, Altiris, Capistrano, Fabric, Func
Information Sharing - Confluence, Google Apps
Installer - InstallShield, IzPack
Integrated Development Environment (IDE) - Eclipse, IDEA, Visual Studio
Issue Tracking - Greenhopper, JIRA
Multi-Type - rPath
Passwords - PassPack, PasswordSafe
Protected Configuration - ESCAPE, ConfigGen
Project Management - JIRA, Pivotal Tracker, SmartSheet
Provisioning - JEOS, BoxGrinder, CLIP, Eucalyptus, AppLogic
Reporting/Documentation - Doxygen, Grand, GraphViz, JavaDoc, NDoc, SchemaSpy, UmlGraph
Static Analysis - CheckStyle, Clover, Cobertura, FindBugs, FxCop, JavaNCSS, JDepend, PMD, Sonar, Simian
Systems Monitoring - CloudKick, Nagios, Zabbix, Zenoss
Testing - AntUnit, Cucumber, DbUnit, webrat, easyb, Fitnesse, JMeter, JUnit, NBehave, SoapUI, Selenium, RSpec, SauceLabs
Version-Control System - SVN/Subversion, git, Perforce
Posted at 02:00 PM in Agile, Automation, Build, Build Management, Cloud, Code Complexity, Code Coverage, Code Metrics, Continuous Integration, Deployment, Developer Testing , Feedback, Operations, Tools | Permalink | Comments (2) | TrackBack (0)
Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. This cloud model promotes availability and is composed of essential characteristics, deployment models, and various service models.
Cloud Definition: National Institute of Standards and Technology (NIST)
There are five key features that the cloud must provide:
One of the nice things about the Public Cloud (through vendors such as Amazon Web Services, Rackspace, etc.) is that market demand doesn't lie and it's much more difficult to obfuscate features. Inside an organization, however, you typically only have one choice - what is given to you by your Operations team. Some Operations teams are afraid of the Cloud and will use a bunch of technical nonsense - either intentionally or through ignorance - about how they're employing a Private Cloud.
Before I cover the Private Cloud myths, reset your understanding of "The Cloud" and think of it more as "Utility Computing". Here are the myths that I hear articulated by seemingly intelligent people.
Myth #1: "I'm hosting my instance so we're on the 'Cloud'"
Nope. Simply hosting software or instances on a remote machine isn't Cloud Computing. It's cool, but it doesn't account for an on-demand, self-service process along with other features. Hosting is a part of Cloud Computing, but it's not the only thing.
Myth #2: "We're using VMWare, Xen, etc. so we're using the Cloud"
Nice try. You probably impressed your non-technical peers, huh? Sadly, probably your technical peers as well. Again, virtualization is a key factor in Cloud Computing - particularly for multi-tenancy; however, it's not the only factor. Read the NIST definition again.
Myth #3: "Most of our provisioning is automated, but, of course, our engineers do need to run the automated scripts that provision the environment"
If you aren't able to launch and terminate instances/software on demand within minutes without assistance, you're not using "The Cloud". If a human is required for any of the provisioning (other than filling out a form/clicking a button, etc.), you don't have a private cloud – as it's not on demand. See "There's a big difference between automated and automatic".
Myth #4: "We've got a cloud, with the exception of metering; It's too difficult to assess on a continuous basis and we like to keep that information private."
If non-Operations team members have no idea, in real time or near real time, what virtual instances cost or what they are being used for (e.g. "Chargebacks"), it’s not a Cloud.
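For a sense of how little it takes to get started with metering, here is a rough Python sketch of a chargeback report that sums instance cost per team. The usage records, the instance IDs, the hourly rates and the `chargeback` function are all made up for illustration; a real private cloud would pull these figures from its own metering data.

```python
from collections import defaultdict

# Hypothetical usage records: (team, instance_id, hours_run, hourly_rate_cents).
# Rates are kept in whole cents so the sums stay exact.
USAGE = [
    ("dev", "i-001", 10, 9),
    ("dev", "i-002", 4, 34),
    ("qa", "i-003", 24, 9),
]


def chargeback(usage):
    """Sum the cost of every instance per team, in cents."""
    totals = defaultdict(int)
    for team, _instance_id, hours, rate_cents in usage:
        totals[team] += hours * rate_cents
    return dict(totals)


if __name__ == "__main__":
    for team, cents in sorted(chargeback(USAGE).items()):
        print(f"{team}: ${cents / 100:.2f}")
```

The point is visibility: once a report like this is published to every team on a regular basis, the cost and purpose of each instance stop being private Operations knowledge.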
________________________________________________
So, actually, the Private Cloud isn't a lie; the term is just misused so often in our buzzword-laden industry that it loses its meaning. The distinction is important because organizations won't realize the benefits of using "The Cloud" if they don't have features such as on-demand self-service, scaling the number and size of instances up and down on demand, resource pooling, etc. Are you in an organization that lies about its use of the Cloud?
Posted at 11:37 AM in Business Perspectives, Cloud, Operations | Permalink | Comments (6) | TrackBack (0)
I’m happy to announce that we’re now offering Elastic Operations. Elastic Operations is a managed service that eliminates all your hardware and replaces it with a reliable, scalable cloud supported by Operations Engineers.
You get self-service provisioning, build, deployment, database administration, issue tracking, and system monitoring -- all managed by our expert engineers at one flat rate per month.
Your flat rate is based on the number of applications you’d like to manage with Elastic Operations. And you can scale these experts up and down on a monthly basis, just like you do with Cloud Computing.
We offer various flat-rate plans for development operations, testing, and production. By utilizing 100% automation and the commoditization of hardware via the cloud, we offer drastically reduced prices compared with traditional operations teams who manage data centers.
What applications would you like to manage better? Sign up for Elastic Operations today, and your applications could be up and running in the cloud tomorrow. Check out the one-minute video on Elastic Operations here.
To get more information on Elastic Operations or Stelligent, send an email to elasticops@stelligent.com
P.S. Use our Cloud ROI Calculator to learn how much you can save when moving to an Automated Operations Cloud.
Posted at 08:28 AM in Automation, Cloud, Continuous Integration, Deployment, News, Operations, Screencast | Permalink | Comments (0) | TrackBack (0)
A one-minute screencast on using the Cloud ROI Calculator to compare the costs of an Automated Cloud Operations team and a Traditional Operations team.
Posted at 11:44 AM in Automation, Cloud, Continuous Integration, Deployment, Operations, Screencast, Video | Permalink | Comments (0) | TrackBack (0)
We help - typically large - organizations create one-click software delivery systems so that they can deliver software in a more rapid, reliable and repeatable manner (AKA Continuous Delivery). The only way this works is when Development works with Operations. As has been written elsewhere in this series, this means changing the hearts and minds of people because most organizations are used to working in ‘siloed’ environments. In this entry, I focus on implementation, by describing a real-world case study in which we have brought Continuous Delivery Operations to the Cloud consisting of a team of Systems and Software Engineers.
For years, we’ve helped customers with Continuous Integration and Testing, so more of our work was with Developers and Testers. Several years ago, we hired a Sys Admin/Engineer/DBA who was passionate about automation. As a result, we began assembling multiple two-person “DevOps” teams consisting of a Software Engineer and a Systems Engineer, both of whom are big-picture thinkers and not just “Developers” or “Sys Admins”. These days, we put together these targeted teams of Continuous Delivery and Cloud experts with hands-on experience as Software Engineers and Systems Engineers so that organizations can deliver software as quickly and as often as the business requires.
A couple of years ago we already had a few people in the company who were experimenting with Cloud infrastructures, so we thought this would be a great opportunity to provide cloud-based delivery solutions. In this case study, I cover a project we are currently working on for a large organization. It is a new Java-based web services project, so we’ve been able to implement solutions using our recommended software delivery patterns rather than being constrained by legacy tools or decisions. However, as I note, we aren’t without constraints on this project. If I were you, I’d call “bullshit!” on any “case study” in which everything went flawlessly and assume it was an extremely small or a theoretical project in the author’s mind. This is the real deal. Enough said, on to the case study.
Fast Facts
Industry: Healthcare, Public Sector
Profile: The customer is making available to all, free of charge, a series of software specifications and open source software modules that together make up an oncology-extended Electronic Health Record capability.
Key Business Issues: The customer wanted all team members to have “unencumbered” access to infrastructure resources without the usual “request and wait” queue-based procedures present in most organizations.
Stakeholders: Over 100 people consisting of Developers, Testers, Analysts, Architects, and Project Management.
Solution: Continuous Delivery Operations in the Cloud
Key Tools/Technologies: Amazon Web Services - AWS (Elastic Compute Cloud (EC2), Simple Storage Service (S3), Elastic Block Store (EBS), etc.), Jenkins, JIRA Studio, Ant, Ivy, Tomcat and PostgreSQL
The Business Problem
The customer was used to dealing with long, drawn-out processes with Operations teams that lacked agility. They were accustomed to submitting Word documents via email to an Operations team, attending multiple meetings and getting their environments set up weeks or months later. We were compelled to develop a solution that reduced or eliminated these problems that are all too common in many large organizations (Note: each problem is identified as a letter and number, for example: P1, and referred to later):
Our Team
We put together a four-person team to create a solution for delivering software and managing the internal Systems Operations for this 100+ person project. We also hired a part-time Security expert. The team consists of two Systems Engineers and two Software Engineers focused on Continuous Delivery and the Cloud. One of the Software Engineers is the Solutions Architect/PM for our team.
Our Solution
We began with the end in mind based on the customer’s desire for unencumbered access to resources. To us, “unencumbered” did not mean without controls; it meant providing automated services over queued “request and wait for the Ops guy to fulfill the request” processes. Our approach is that every resource is in the cloud: Software as a Service (SaaS), Platform as a Service (PaaS) or Infrastructure as a Service (IaaS) to reduce operations costs (P10) and increase efficiency. In doing this, effectively all project resources are available on demand in the cloud. We have also automated the software delivery process to Development and Test environments and are working on the process of one-click delivery to production. I’ve identified the problem we’re solving - from above - in parentheses (P1, P8, etc.). The solution includes:
Benefits
The benefits are primarily around removing the common bottlenecks from processes so that software can be delivered to users and team members more often. Also, we think our approach to providing on-demand services over queue-based requests increases agility and significantly reduces costs. Here are some of the benefits:
Tools
Here are some of the tools we are using to deliver this solution. Some of the tools were chosen by our team exclusively and some by other stakeholders on the project.
Solutions we're in the process of Implementing
We’re less than a year into the project and have much more work to do. Here are a few projects we’re in the process of implementing or will be starting soon:
What we would do Differently
Typically, if we were to start a Java-based project and recommend tools around testing, we might choose the following tools for testing, requirements and test management based on the particular need:
However, as on most projects, there are many stakeholders who have their preferred approaches and tools they are familiar with, the same way our team does. Overall, we are pleased with how things are going so far and the customer is happy with the infrastructure and approach that is in place at this time. I could probably do another case study on dealing with multiple SaaS vendors, but I will leave that for another post.
Summary
There’s much more I could have written about what we’re doing, but I hope this gives you a decent perspective on how we’ve implemented a DevOps philosophy with Continuous Delivery and the Cloud and how this has led our customer to a more service-based, unencumbered and agile environment.
Posted at 10:04 AM in Agile, Automation, Cloud, Continuous Integration, Operations | Permalink | Comments (0) | TrackBack (0)
We've worked in numerous types of organizations, from multi-billion dollar corporations to non-profits and small startups. We've worked with Operations teams, been part of the Operations team and comprised the entire Operations team. We've learned what works and what doesn't work in various types of organizations. What we've found is that while advances in Cloud Computing and automation continue to surge, many Operations teams still operate like they did 20 years ago. There is a better way...a MUCH better way: Continuous Delivery in the Cloud. This means 100% automation of the delivery process (build, test, deployment and release) and the utilization of Cloud resources. Cloud by itself is just a buzzword. Cloud coupled with 100% automation is where you get huge productivity gains, enabling you to deliver software as quickly and as often as the business desires.
While the Cloud ROI Calculator we developed is geared toward how much costs are reduced (for example, over $3 million USD per year for a 20+ application organization), the most considerable value, in my opinion, is when implementing Continuous Delivery with 100% automation in the Cloud, you can release more quickly and more often.
How to Enter Information
There are four items of data entry: Average Engineer Hourly Rate, Number Of Applications In Portfolio, Average Size Of Projects and Average Technical Architecture Complexity. Since all data entry uses sliders, you can use the default values if you want to see how it works and then go back and modify values to see different results based on your organization. A medium sized project is approximately a 25K-100K SLOC code base, small is less than 25K and large is greater than 100K. Keep in mind that it's an average of all of the projects in your organization. The technical architecture complexity considers the number of application servers, number of database tables, configuration and other technical complexity. Again, it's the average technical complexity of all your applications.
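The size thresholds above are easy to express in code. Here is a small Python sketch; the `project_size` function name is my own and is not taken from the calculator itself, and I'm assuming the 25K and 100K boundaries both count as medium, since small is "less than 25K" and large is "greater than 100K".

```python
def project_size(sloc: int) -> str:
    """Classify a code base by source lines of code (SLOC).

    Thresholds follow the text: small is under 25K, medium is 25K-100K,
    large is over 100K. Both boundaries are assumed to count as medium.
    """
    if sloc < 0:
        raise ValueError("SLOC cannot be negative")
    if sloc < 25_000:
        return "small"
    if sloc <= 100_000:
        return "medium"
    return "large"


if __name__ == "__main__":
    for sloc in (12_000, 25_000, 100_000, 250_000):
        print(sloc, project_size(sloc))
```

Remember that the calculator asks for the average across your portfolio, so you would classify the mean SLOC of all your projects, not each project individually.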
Explanation of ROI Results
Once you click the Calculate ROI button, you'll see six rows of information, which are explained below. For each row, you'll see the number/cost when using 100% Automated Operations (what Stelligent provides for our clients) vs. the Traditional Operations team that manually performs and queues human tasks.
Number of hardware instances - For medium size projects and complexity, we assume 10 ephemeral instances per application and 20 fixed instances.
Hardware costs per year - Because of commoditization and economies of scale, cloud instances are approximately 1/4 the cost of managing your own data center. Source: A combination of The Economics of the AWS Cloud vs. Owned IT Infrastructure and work performed by Stelligent.
Number of engineers - This is where the real cost savings are: the number of engineers required to create and support a 100% automated cloud environment. Using a 100% automated cloud, you pay 1/3 the costs in terms of the human capital required to create and maintain the infrastructure. Source: Work performed by Stelligent.
Engineering costs per year - The total cost based on the average engineer rate in your organization, the number of applications and whether it's automated cloud or traditional Operations.
Organizational cost per year - The hardware plus the engineering cost per year.
Total savings per year - The amount in savings between automated cloud and traditional Operations per year.
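To show how the last two rows relate to the others, here is a back-of-the-envelope Python sketch. It is not the calculator's actual formula: the function names and the dollar figures are hypothetical, and the only things taken from the text are that organizational cost is hardware plus engineering, that savings is the difference between the two approaches, and the approximate 1/4 and 1/3 ratios.

```python
def organizational_cost(hardware: int, engineering: int) -> int:
    """Organizational cost per year: hardware plus engineering cost."""
    return hardware + engineering


def total_savings(traditional: int, automated: int) -> int:
    """Savings per year: traditional Operations cost minus automated cloud cost."""
    return traditional - automated


# Hypothetical yearly figures in USD for a traditional Operations team.
TRAD_HARDWARE = 400_000
TRAD_ENGINEERING = 1_200_000

# Apply the approximate ratios quoted in the text (assumed to hold exactly here).
AUTO_HARDWARE = TRAD_HARDWARE // 4        # cloud hardware at roughly 1/4 the cost
AUTO_ENGINEERING = TRAD_ENGINEERING // 3  # engineering at roughly 1/3 the cost

if __name__ == "__main__":
    trad = organizational_cost(TRAD_HARDWARE, TRAD_ENGINEERING)
    auto = organizational_cost(AUTO_HARDWARE, AUTO_ENGINEERING)
    print("Traditional:", trad)
    print("Automated cloud:", auto)
    print("Savings per year:", total_savings(trad, auto))
```

With these made-up inputs, most of the savings come from the engineering row, which matches the claim above that the number of engineers is where the real cost savings are.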
About ROI Calculator Development
We used the Platform as a Service (PaaS) offering, Google App Engine (GAE), to develop the calculator. My friend Andy Glover developed the first release of the calculator using GAE, Groovy, Gaelyk, etc. We've since provided new features. The application is essentially stateless, but we're using BigTable to manage certain configurable values. Because it's a PaaS offering, we don't worry about hosting, uptime, etc. It just works. GAE also provides a comprehensive dashboard that has extensive logging, etc.
Posted at 04:23 PM in Agile, Automation, Cloud, Operations | Permalink | Comments (0) | TrackBack (0)
A brief conversation between a developer and a Systems Engineer who still runs his systems like it was 1995...
Developer: I would like a target environment created for me.
Operations: You need to send an email and we will get back to you in a day or so.
Developer: Ok, sending the email now.
Operations: Thanks for the email. Please send us your requirements including your overall architectural approach.
Developer: Ok, here are our requirements and architecture
Operations: Now, we need to get approval from management.
Operations: Ok, we need to schedule a meeting to go over your requirements
Operations: Now that we've had the meeting, we need to schedule a time to setup the servers and environment. This will take a couple of days.
Developer: So, to get one environment it takes 40 hours of actual time and one week of wait time? I'm going to the Cloud and using a provisioning application so that I can get my environment in minutes instead of weeks!
Posted at 12:51 PM in Automation, Operations, Screencast, Video | Permalink | Comments (0) | TrackBack (0)
Amazon Web Services released their Platform as a Service offering on Wednesday, January 19th. I've had an opportunity to play with it and I'm quite impressed. I created a seven-minute screencast that takes you through the steps to deploy and configure an application/environment using Elastic Beanstalk. In this screencast, you'll see how easy it was to get a Hudson CI server up and running in an EC2 environment. Furthermore, Elastic Beanstalk provides automatic scaling, monitoring and configuration right 'out of the box'. It's worth checking out.
Posted at 12:30 AM in Agile, Automation, Cloud, Continuous Integration, Operations, Screencast, Tutorial, Video | Permalink | Comments (0) | TrackBack (0)