Pachyderm is an AI/ML workflow platform where data engineers and data scientists upload data, train models, and compare results.

I was the first design hire at Pachyderm (Series B funded, YC-backed). I spearheaded design across the company through its acquisition by Hewlett Packard Enterprise (HPE) 2.5 years later.

Team

Me (Design lead IC), a team of 4-5 FE engineers, Engineering Lead, 2 founders

Timeline

My design work started in April 2020, and ended in Jan 2023.

Results

The launch of the Hub designs increased engagement from non-enterprise users by close to 400%, but the Hub project was eventually discontinued because of the cost of maintaining it. The launch of the Console platform received positive feedback, as gathered by the sales team. The company was eventually acquired by Hewlett Packard Enterprise after my 2.5 years there.

My Role

Led UX/UI designs across all products.

Built the first global design system from the ground up and established a consistent design language across products.

Led product strategy discussions directly with founders and engineering lead.

Conducted competitive analysis, market / design research.

Conducted user interviews with the sales team.

Supervised design work by other designers.

I was responsible for design across the entire product ecosystem, and the first project I designed was Hub.

Before Hub launched, there were no consumer-facing products. Pachyderm only had enterprise users, and getting started with Pachyderm required 1-2 hours of prep work.

Hub takes care of all the prerequisites and local installs for enterprise & non-enterprise users (ML engineers / scientists) with one click.

The Hub platform runs in the cloud, so spinning up workspaces for end users was extremely easy. This lowered the barrier to entry, especially for non-enterprise and beginner data engineers and ML scientists.

However, Hub could be expensive to use. A detailed billing page that broke expenses down into individual items was therefore an important user need.

The launch of Hub was significant: it increased engagement from non-enterprise users by close to 400%.

Designing Hub was also an opportunity for me to build out Pachyderm’s global design system from the ground up.

When I joined, the company had branding assets designed by an external agency. What I had to work with was a few basic colors, the logos, some illustrations, and the typefaces the agency had chosen.

A review of competitors showed one major theme: heavy use of blues and greens. Pachyderm’s branding was meant to bring fun and joy to the serious domain of AI/ML, which set it apart. Still, blues and greens have merit, since they are traditionally associated with professionalism and trust. I modified the color palette by introducing a spectrum of blacks to greys to balance out the fun, colorful palette that Pachyderm originally had.

A snapshot of the component library by Eliana

By the end of the Hub designs, I had built the first version of the global design system. Each component has its own states and usage. The development team then built a Storybook component library based on my design system.

The next important product launched was Console. Console is where ML engineers / scientists train their models: upload data, transform data, compare results.

The majority of Pachyderm users ran their workflows from the terminal. There was a very basic Console platform, designed by a developer before I joined, but it was so hard to use that almost no Pachyderm users touched it. My job was to redesign Console and “bring it back to life.”

The old Console platform, designed by a Pachyderm developer.

A UX audit of the old Console platform pointed to three obvious design goals: 1. Rebrand it completely. 2. Create an intuitive user flow. 3. Incorporate core user tasks.

In the latest Console designs, I aimed to maintain continuity of brand experience from the Hub platform to the Console platform by using a consistent design language.

The DAG is a unique feature for AI/ML users. A DAG (directed acyclic graph) consists of repos and pipelines. An input repo, at the very beginning of the DAG, is where data gets ingested. The input repo connects to a pipeline that contains code specs, and the pipeline transforms the data set according to those specs. The pipeline in turn connects to an output repo, where the transformed data set is written. An output repo can become the next input repo, and so on.
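The repo and pipeline structure described above can be sketched as a small graph model. This is only an illustration, not Pachyderm's actual data model: the `Dag` class, the `repo:` and `pipeline:` prefixes, and the `images`, `edges`, and `montage` names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Dag:
    """Hypothetical model of a DAG made of repos and pipelines."""

    # Maps each node id to the ids of the nodes it feeds into.
    edges: dict[str, list[str]] = field(default_factory=dict)
    # Maps each node id to its kind: "repo" or "pipeline".
    kinds: dict[str, str] = field(default_factory=dict)

    def add_node(self, node_id: str, kind: str) -> None:
        self.kinds[node_id] = kind
        self.edges.setdefault(node_id, [])

    def connect(self, source: str, target: str) -> None:
        self.edges[source].append(target)

    def topological_order(self) -> list[str]:
        """Order nodes so data always flows left to right (Kahn's algorithm)."""
        indegree = {node: 0 for node in self.edges}
        for targets in self.edges.values():
            for target in targets:
                indegree[target] += 1
        ready = [node for node, degree in indegree.items() if degree == 0]
        order = []
        while ready:
            node = ready.pop(0)
            order.append(node)
            for target in self.edges[node]:
                indegree[target] -= 1
                if indegree[target] == 0:
                    ready.append(target)
        return order


# Input repo -> pipeline -> output repo, which then feeds the next pipeline.
dag = Dag()
for node_id, kind in [
    ("repo:images", "repo"),
    ("pipeline:edges", "pipeline"),
    ("repo:edges", "repo"),
    ("pipeline:montage", "pipeline"),
    ("repo:montage", "repo"),
]:
    dag.add_node(node_id, kind)

dag.connect("repo:images", "pipeline:edges")
dag.connect("pipeline:edges", "repo:edges")
dag.connect("repo:edges", "pipeline:montage")
dag.connect("pipeline:montage", "repo:montage")

order = dag.topological_order()
print(" -> ".join(order))
```

Even this two-pipeline example needs five nodes, which hints at how quickly a real DAG grows on screen.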

Design Hypothesis: the visualization (DAG) is the main value prop for Console moving from CLI to consumer UI.

The DAG was a design challenge unlike anything I had designed or encountered before. However, my design intuition was that visualizing the data transformation, which is what the DAG does, would be incredibly important for users and would be the main value add for both Console and Pachyderm in general.

The first thing I did was conduct a heuristic analysis of the old DAG. Sometimes the DAG could be very complex. The graph above actually contains multiple DAGs that cross over each other. A few unintuitive issues I noticed:

A pipeline is added on top of a repo - it was unclear to the users that those are separate entities
No directions
Edge crossing
Unintuitive representations of repos and pipelines
No separation from one DAG to another

Some competitors, such as Google Pipeline and Airflow, have a DAG feature. Google Pipeline gave me one interesting clue: surfacing statuses on repos and pipelines could be important to users. Beyond that, the competitors did not give me much to go on. So what now?

A snapshot of some user-generated DAGs that I was able to gather.

After some digging, I learned that the engineering sales team had saved some graphs created by users themselves. This gave me more confidence in the design direction. A few important learnings from the user-generated DAGs:

Directionality of the DAGs
Distinct representations of repos and pipelines
Grid-based layouts
Mixed orientations: some graphs are vertical and some are horizontal

After this research, I explored many different versions of the DAG. What happens if the DAGs get very large and complicated? The user-generated DAGs were huge PDFs that I had to zoom in and out of to see the details. So I made another hypothesis:

Design Hypothesis: considering the number of nodes a DAG could have, being able to move around the DAG, zoom in/out, etc. would be very important (canvas view).

Competitive analysis on products that contain canvases.

No product out there put DAGs on canvases. With no direct references, I decided to do a cross-industry analysis to draw some parallels, looking at the canvas features and capabilities of products such as Figma and Miro.

After synthesizing all the research data and learnings, I came up with the V1 designs of the DAG. A few things to note:

Repos and pipelines have representations that are closer to our design language and brand.
Repos and pipelines are separated, each with its own entity name.
Directionality was added.
DAGs are spatially separated from one another.
Designs are grid-based.
DAGs sit within a canvas that can be zoomed in/out and rotated horizontally or vertically.

DAG components were documented and added to the component library (in collaboration with another designer)

Post launch user insights: the DAGs could get really long - how can we further shorten the DAGs for the users?

In response to users’ feedback, I went through a second round of iteration on the DAG design. One fact I had not paid much attention to before is that pipelines and their output repos share the same entity names, so I ran a few explorations to leverage this. I showed these explorations to users in sales meetings, and Version C was the one that really clicked with them. They loved the idea of “combo nodes”: Version C is clean and straightforward.
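The combo node idea can be sketched in a few lines: wherever a pipeline feeds an output repo with the same entity name, the two are merged into one node. This is a minimal sketch under assumed names, not the production logic; the `(kind, name)` tuple format, the `collapse_combo_nodes` function, and the sample entity names are hypothetical.

```python
Node = tuple[str, str]  # (kind, entity name), e.g. ("pipeline", "edges")


def collapse_combo_nodes(edges: list[tuple[Node, Node]]) -> list[tuple[Node, Node]]:
    """Merge each pipeline with its same-named output repo into one combo node."""
    # A name qualifies when a pipeline points directly at a repo of the same name.
    combo_names = {
        source[1]
        for source, target in edges
        if source[0] == "pipeline" and target[0] == "repo" and source[1] == target[1]
    }

    def merged(node: Node) -> Node:
        kind, name = node
        if name in combo_names and kind in ("pipeline", "repo"):
            return ("combo", name)
        return node

    collapsed = []
    for source, target in edges:
        edge = (merged(source), merged(target))
        # Drop the pipeline -> output repo edge that now points at itself.
        if edge[0] != edge[1] and edge not in collapsed:
            collapsed.append(edge)
    return collapsed


original = [
    (("repo", "images"), ("pipeline", "edges")),
    (("pipeline", "edges"), ("repo", "edges")),
    (("repo", "edges"), ("pipeline", "montage")),
    (("pipeline", "montage"), ("repo", "montage")),
]
combined = collapse_combo_nodes(original)

nodes_before = {node for edge in original for node in edge}
nodes_after = {node for edge in combined for node in edge}
print(len(nodes_before), "nodes before,", len(nodes_after), "nodes after")
```

Input repos that no pipeline writes to keep their own node, so only the pipeline and output repo pairs are shortened.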

Now I had a direction to work with, but I wanted to make sure we had explored the best possible option. I therefore compared a vertical layout against a horizontal layout for Version C, and mapped out all the potential scenarios for each to weigh the pros and cons. Even though the vertical layout fits longer entity names, users still favored the horizontal version because it shows the relationship between pipeline and output repo more clearly. The positioning makes more sense, and the arrow adds directionality that further indicates that relationship.

Another user insight we gathered post-launch was that the visual representations of repos and pipelines weren’t intuitive enough.

Exploration of different DAG Node representations by Eliana

I therefore explored a few more iconography options. These were tested again with users during sales meetings, and the user-favorite direction is shown on the right-hand side above.

When I put the original DAG design alongside the combo node design, the combo nodes shorten the DAG considerably.

In collaboration with a new design hire, we also reassessed the global design system.

Accessibility pass on the colors (in collaboration with another designer)

Most importantly, we did an accessibility pass on all the colors and typography. We ran user tests on the typography pairing and made some tweaks to this foundation.

Typography pairing testing (in collaboration with another designer)

The most important takeaway for me is that we don’t always have perfect conditions when designing. There might not be any examples out there to draw on as references. It is paramount to be able to make educated hypotheses, test, experiment, tweak, and repeat.

Pachyderm AI/ML Console