My foot in the door at Hewlett-Packard Company was a step down from an engineering position to a technician role. Since I was moving from the field of radiation safety to software development, with no real software development experience, it was a reasonable trade. The real value, though, was that it afforded a Genchi Genbutsu approach to performance improvement.

Genchi Genbutsu

I started at HP as a computer integrated manufacturing support technician. HP was growing manufacturing at their Vancouver, WA plant, and they were trying to do so with as little added labor as possible. This was before the rush to chase the lowest-cost labor markets around the world, and the approach was automation. My role was to support the automation that supported the manufacturing system.

As a technician I was essentially on call for any manufacturing automation issue, and when I wasn't firefighting, I had other development and support work. But I made a point of walking the lines at least hourly and checking in with the line operators.

At that point, the top-level assembly lines were experiencing as much as 30% downtime. By walking the lines I observed several common occurrences. A majority of the downtime was caused by failures of the final test stations. There were two of these on each line, and they involved a good deal of complexity. When one station failed, material would back up behind it, and that backup often spread until it blocked both stations, bringing the whole line to a standstill.

Another recurring problem was that the process technicians on the line would spend a considerable amount of time isolating sensor faults, which could have the line down for several hours as they replaced sensors piecemeal. One time, after watching these folks work for a bit, I went to the line control computer and searched for where the line was blocked in the control code. That took me about two minutes. I then walked back to the area where three technicians were working and pointed out which sensor was causing the problem. One of them told me, “Oh, no, that can’t be the problem,” and they proceeded to replace sensors until they got to the one I had pointed out. Three hours later, after replacing that sensor, the line was back up and running.

When I asked why no one had bothered to look at the code, I was told, 1) because no one could read it, and 2) because every segment of the line had been written by a different process engineer, and so was coded completely differently.

These production lines had 10 to 12 PCs in place to run various tests, operate vision systems, and perform data collection tasks. As assembly lines go, these were pretty clean operations, but the assembly process still generated a fair amount of dust, and these were relatively inexpensive PCs, which, as is typical, were cooled by drawing a negative pressure on the case to pull air across the components. That meant they also pulled dust into the case, clogging airflow around things like the CPU cooler and the hard drive. These PCs would be lucky to survive a year before failing. The real problem, though, was that when one failed, a replacement had to be built and specifically configured for that station.

A final major recurring problem was that, all too frequently, the line control computer would choke and require restarting, and a restart took at least 20 minutes.

Standardized Work

Fast forward 18 months after I started at HP: I moved into an engineering role and was assigned to develop the line control and data collection systems for a set of new production lines. The first thing I did was ask how much control I had over the design. My then-manager told me, in the HP Way of the time, “better to ask forgiveness than permission.” That was enough for me.

My first move was to approach the section manager with oversight of the technical support staff. I asked for at least one technician from each shift to be part of a line control coding team: no more coding by process engineers. I arranged a three-day programmable logic controller (PLC) training course for this group, then arranged for them to be on day shift for two weeks to develop the line control code. The idea was that at least one person from each shift would be intimately familiar with the line control logic.

Recognizing that this would be a significant undertaking for these folks, I also arranged for a hotshot PLC programmer to work with the line designer to develop a standard base workstation design and control code block, known as a standard stopper set. Every station on the line began as a standard stopper set. If a station needed additional control, a vision system for example, it still started from this standard stopper set.
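To make the pattern concrete: the real control code was PLC ladder logic written in the controller vendor's tools, so nothing below is actual line code. This is just a rough Python-flavored sketch, with hypothetical class, sensor, and signal names, of what the standard stopper set bought us: every work cell starts from the same base block, and customization extends that block rather than replacing it.

```python
# Sketch only: the real line ran PLC ladder logic, not Python, and every name
# here (StandardStopperSet, VisionStation, the sensor/signal labels, SimulatedIO)
# is a hypothetical illustration of the pattern, not actual code.

class StandardStopperSet:
    """Base logic shared by every work cell: detect a pallet at the stopper,
    do the station's work, then release the pallet downstream."""

    def __init__(self, station_id, io):
        self.station_id = station_id
        self.io = io  # read/write access to the cell's sensors and actuators

    def do_work(self):
        # The plain stopper set does no extra work; the pallet just passes through.
        pass

    def cycle(self):
        if self.io.read("pallet_at_stopper"):
            self.do_work()
            self.io.write("stopper_down", True)    # release the pallet
            # (real code would wait on a pallet-clear sensor here)
            self.io.write("stopper_down", False)   # reset for the next pallet


class VisionStation(StandardStopperSet):
    """A customized cell: identical stopper logic, plus a vision inspection."""

    def do_work(self):
        self.io.write("camera_trigger", True)
        passed = self.io.read("vision_pass")
        self.io.write("camera_trigger", False)
        self.io.write("reject_flag", not passed)


class SimulatedIO:
    """Stand-in for the real I/O layer so the sketch runs on its own."""

    def __init__(self, inputs):
        self.inputs = dict(inputs)
        self.outputs = {}

    def read(self, name):
        return self.inputs.get(name, False)

    def write(self, name, value):
        self.outputs[name] = value


if __name__ == "__main__":
    io = SimulatedIO({"pallet_at_stopper": True, "vision_pass": True})
    VisionStation("hypothetical_vision_cell_07", io).cycle()
    print(io.outputs)  # {'camera_trigger': False, 'reject_flag': False, 'stopper_down': False}
```

The value wasn't in any particular block of code; it was that a technician on any shift saw the same base structure at every station, so a fault at a customized station was still mostly familiar territory.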

We then brought our programming team on board for two weeks of day shift and assigned each member a set of workstations. Each station began by copying the standard stopper control code, which was then customized for that station. On previous lines, dedicated line control computers were installed in the center of the line, which meant all troubleshooting was done by poking at the code, walking down to the station to try something out, walking back to the controller to change a line of code, and repeating until you had walked ten miles in a development shift. For this line, we used a portable programming station that could be taken directly to the work cell, so you could see in real time what was happening. We had the line fully operational in under two weeks.

Standard PC config with redundancy

I changed the PC strategy. We used industrial-grade, rack-mounted PCs with filtered, positive-pressure cooling. More significantly, every PC on the line was configured identically: any PC could be set up for any station by running a simple batch file, in about three minutes. We also had an installed spare. In the event of a failure, we pulled out the failed PC, put the installed spare in its place, and ran the config batch file.
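The actual mechanism was a short batch file; the sketch below is a hypothetical Python rendering of the same idea, not the original script. The directory layout, the station names, and the detail that each machine carries every station's config set are assumptions for illustration, but they show why the swap took only minutes: configuring a machine was just copying one station's files into place on an identically imaged PC.

```python
# Hypothetical sketch of the station-configuration step. The real line used a
# short batch file; this Python version, and the paths and station names in it,
# are illustrative assumptions, not the original script.

import shutil
import sys
from pathlib import Path

STATION_CONFIGS = Path("C:/line/configs")  # assumed layout: one subfolder per station
ACTIVE_CONFIG = Path("C:/line/active")     # assumed location the station software reads

def configure_station(station_name: str) -> None:
    """Turn this identically imaged PC into a specific station's PC."""
    source = STATION_CONFIGS / station_name
    if not source.is_dir():
        raise SystemExit(f"Unknown station: {station_name}")
    # Drop whatever identity the PC had and copy in the new station's files.
    if ACTIVE_CONFIG.exists():
        shutil.rmtree(ACTIVE_CONFIG)
    shutil.copytree(source, ACTIVE_CONFIG)
    print(f"Configured as {station_name}; restart the station software to go live.")

if __name__ == "__main__":
    configure_station(sys.argv[1])  # e.g. configure_station.py final_test_2
```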

For the final test stations, we knew we could handle normal capacity with two stations, but I fought hard to add a third, backup station. The argument against installing it was cost, nearly $100K. But we estimated downtime at around $29K/hr in people standing around and lost production, so I argued that we only needed to avoid four hours of downtime to recover that cost. I won out: we installed three test stations and rotated among them so that two were in operation at any one time.
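Written out as a break-even calculation, using the rough figures above:

$$\text{break-even downtime} = \frac{\$100\text{K (third station)}}{\$29\text{K/hr (downtime cost)}} \approx 3.5 \text{ hours}$$

so four hours of avoided downtime over the life of the line more than paid for the extra station.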

Results

The net results of these changes on this line were:

  1. We went from a downtime rate of nearly 30% to under 3%.
  2. What had once been a 30-minute manual start-up requiring the manual start and monitoring of multiple systems became a 3-minute automated start-up, from breaker close to operational status.
  3. We suffered no downtime as a result of final test. If a failure occurred, the operators simply shifted to the standby station, and the failed station could be troubleshot without affecting line throughput.
  4. PCs did fail. The difference was that when a PC failed, it was swapped with the installed spare, the spare was configured for the station with a batch file, and the problem was troubleshot off-line.
  5. The behavior of the technicians changed. Instead of blindly hunting and pecking at problems, the first thing they would do was pull the line programming cart to the station and find the control fault. Troubleshooting times dropped to a tenth of what they had been, and ownership of the line's operation and improvement went way up.

On the first day of operation of the first line, my boss's boss stopped by my desk. I think I had been there for 12 hours troubleshooting multiple issues as we brought the production system on-line. I remember the heart-sinking thought as she sat down on the other side of my desk: OK, what did I screw up now? But what she said was that she had just been in the end-of-shift meeting with the line operators, and they said they had never had such a smooth first day on a line start-up.

Moral of this story

If I look at this from the Toyota Way perspective, I employed several principles here. The most significant was Genchi Genbutsu: go and see. By working as a technician in the factory for 18 months, I had first-hand experience with what was, and was not, working. I remember being stopped by one line worker and asked to just stop and watch. There was an anomaly that required operators to wait until a timer expired before they could complete their work. It was only visible because I was available for an operator to call my attention to it, and then to stand and watch and see it for myself.

We employed standardization as much as possible. Even where a workstation needed to be customized, it was based on a standard code set, which allowed rapid diagnosis of faults. Standard PC configurations allowed rapid switch-over and off-line troubleshooting.

We developed the capability of our people. The process engineers were still responsible for the assembly process, but the technicians who would have to deal with the line every day were the ones who actually coded the control system.

I could probably list several other Toyota Way principles we employed, but my point here is really that we weren't employing the Toyota Way; we were making process improvements in ways that worked for the system, the people, and the culture. In the same way, Toyota didn't develop their processes and principles because someone else did them; they did them because they provided value.