Company

About Fleet

Fleet Device Management Inc. is an open core company that sells subscriptions that offer more features and support for Fleet and osquery, the leading open source endpoint agent.

We are dedicated to

  • 🧑‍🚀 automating IT and security.
  • 💍 reducing the proliferation of agents and growing the adoption of osquery (one agent to rule them all).
  • 🪟 privacy, transparency, and trust through open source software.
  • 👁️ remaining the freshest, simplest source of truth for every kind of computing device and OS.
  • 💻 building a better way to manage computers.

Culture

All remote

Fleet Device Management Inc. is an all-remote company with team members spread across four continents and eight time zones. The broader team of contributors worldwide submits patches, bug reports, troubleshooting tips, improvements, and real-world insights to Fleet's open source code base, documentation, website, and company handbook.

Open source

The majority of the code, documentation, and content we create at Fleet is public and source-available. We strive to be open and transparent in the way we run the business, as much as confidentiality agreements (and time) allow. We perform better with an audience, and our audience performs better with us.

🌈 Values

Fleet's values are a set of five ideals adopted by everyone on the team. They describe the culture we are working together to deliver, inside and outside the company:

  1. 🔴 Empathy
  2. 🟠 Ownership
  3. 🟢 Balance
  4. 🔵 Objectivity
  5. 🟣 Openness

When a new team member joins Fleet, they adopt the values from day one. This way, even as the company grows, everybody knows what to expect from the people with whom they work. Having a shared mindset keeps us quick and determined.

🔴 Empathy

Empathy leads to better understanding, better communication, and better decisions. Try to understand what people may be going through, so you can help make it better.

  • Think and make customer-first choices.
  • Consider your counterpart.
    • For example: keep in mind customers, contributors, colleagues, the other person in your Zoom meeting, the other folks in a Slack channel, the people who use the software and APIs you build, and the people following the processes you design.
    • Ask questions in a way you would want to be asked.
    • Assume others have positive intent.
    • Be kind.
    • Quickly review pending changes when your review is requested.
    • Be punctual.
    • End meetings on time.
  • Role play as a user.
    • Don't be afraid to rely on your imagination to understand.
    • Developers are users too (REST API, fleetctl, docs).
    • Contributor experience matters (but product quality and commitments come first).
    • Bugs cause frustrating experiences and alienate users.
    • Create patches with care (upgrading to new releases of Fleet can be time-consuming for users running self-managed deployments).
    • Confusing error messages make people feel helpless and can fill them with despair.
    • Error messages deserve to be good (spending time on them is worth it).
    • UI help text and labels deserve to be good (it's worth it to spend time on them).
  • Invest in hospitality.
    • "Be a helper." -Mr. Rogers
    • Think and say positive things.
    • Use the #thanks channel to show genuine gratitude for other team members' actions.
    • Talking with users and contributors is time well spent.
    • Embrace the excitement of others (it's contagious).
    • Make small talk at the beginning of meetings.
    • Be generous (go above and beyond; for example, the majority of the features Fleet releases will always be free).
    • Apply customer service principles to all users, even if they never buy Fleet.
    • Treat everyone as our guests.
    • Better humanity.

🟠 Ownership

  • Take responsibility.
    • Think like an owner.
    • Follow through on commitments (actions match your words).
    • Own up to mistakes.
    • Understand why it matters (the goals of the work you are doing).
    • Consider the business impact (fast forward 12 months, consider the total cost of ownership over the eternity of maintenance).
    • Often, you'll need to own processes that won't scale. Not everything should be automated from the start. Your experience with doing things manually will teach us how to scale effectively later.
  • Be responsive.
    • Respond quickly, even if you can't take further action at that exact moment.
    • When you disagree, give your feedback; then agree and commit, or disagree and commit anyway.
    • Favor short calls over long asynchronous back and forth discussions in Slack.
    • Procrastination is a symptom of not knowing what to do next (if you find yourself avoiding reading or responding to a message, schedule a Zoom call with the people you need to figure it out).
  • We win or lose together.
    • Think about the big picture beyond your individual team's goals.
    • Success equals creating value for customers.
    • You're not alone in this (there's a great community of people able and happy to help).
    • Don't be afraid to spend time helping users, customers, and contributors (including colleagues on other teams).
    • Be proactive (ask other contributors how you can help, regardless of who is assigned to what).
    • Finish completely before moving to something new (help unblock team members and other contributors to deliver value).
  • Take pride in your work.
    • Be efficient (your time is valuable, your work matters, and your focus is a finite resource).
    • You don't need permission to be thoughtful.
    • Reread anything you write for users.
    • Take your ideas seriously (great ideas come from everyone; write them out and see if they have merit).
    • Think for yourself (from first principles).
    • Use reason (believe in your brain's capacity to evaluate a solution or idea, regardless of its popularity).
    • You are on a hero's journey (motivate yourself intrinsically with self-talk; even boring tasks are more motivating, fun, and effective when you care).
    • Better your results.

🟢 Balance

Between overthinking and rushing, there is a golden mean.

  • Iterate your work.
    • Work in baby steps.
    • Pick low-hanging fruit (deliver value quickly where you can).
    • Think ahead, then make the right decision for now.
    • Look before you leap (when facing a non-trivial problem, get perspective before diving in; there may be a simpler solution).
  • Move quickly.
    • "Everything is in draft."
    • Think fast (balance thoughtfulness and planning with moving quickly).
    • Aim to deliver results daily.
    • Move faster than 90% of the humans you know.
    • Resist gold-plating and avoid bike-shedding.
  • Remember, less is more.
    • Focus on fewer tasks at one time.
    • Go with "boring solutions."
    • Finish what you start, or at least throw it away loudly in case someone else wants it.
    • Keep it simple (prioritize simplicity; people crave mental space in design, collaboration, and most areas of life).
    • Use fewer words (lots of text equals lots of work).
    • As time allows ("I would have written a shorter letter, but I did not have the time." -Blaise Pascal).
  • Make time for self-care.
    • This helps you bring your best self when communicating with others, making decisions, etc.
    • Consider taking a break or going for a walk.
    • Take time off (it is better to have 100% focus for 80% of the time than it is to have 80% focus for 100% of the time).
    • Think about how to organize your day/work hours to best fit your life and maximize your focus.
  • Better focus.

🔵 Objectivity

  • Be curious.
    • Ask great questions & take the time to listen truly.
    • Listen intently to feedback and genuinely try to understand (especially constructive criticism).
    • See failure as a beginning (it is rare to get things right the first time).
    • Question yourself ("Why do I think this?").
  • Underpromise and overdeliver.
    • Quality results often take longer than we anticipate.
    • Be practical about your limits and about what's possible with the time and resources we have.
    • Be thorough (don't settle for "the happy path"; every real-world edge case deserves handling).
  • Prioritize the truth (reality).
    • Be wrong and show your work (it's better to make the right decision than it is to be right).
    • Think "strong opinions, loosely held" (proceed boldly, but change your mind in the face of new evidence).
    • Avoid the sunk cost fallacy (getting attached to something just because you invested time working on it or came up with it).
    • Be fair to competitors ("may the best product win.").
    • Give credit where credit is due; don't show favoritism.
    • Hold facts, over commentary.
  • Speak computer to computers.
    • A lucky fix without understanding does more harm than good.
    • When something isn't working, use the scientific method.
    • Especially think like a computer when there is a bug, or when something is slow, or when a customer experiences a problem.
    • Assume it's your fault.
    • Assume nothing else.
  • Better your rigor.

🟣 Openness

  • Anyone can contribute to Fleet.
    • Be outsider-friendly, inclusive, and approachable.
    • Use small words, so readers understand more easily.
    • Prioritize accessible terminology and simple explanations to provide value to the largest possible audience of users.
    • Avoid acronyms and idioms which might not translate.
    • Welcome contributions to your team's work from people inside or outside the company.
    • Get comfortable letting others contribute to your domain.
    • Believe in everyone.
  • Write everything down.
    • Use the "handbook first" strategy.
    • Writing your work down makes it real and allows others to read on their own time (and in their own timezone).
    • Never stop consolidating and deduplicating content (gradually, consistently, bit by bit).
  • Embrace candor.
    • Have "short toes," and don't be afraid of stepping on toes.
    • Don't be afraid to speak up (ask questions, be direct, and interrupt).
    • Give pointed and respectful feedback.
    • Take initiative in trying to improve things (no need to wait for a consensus).
    • Communicate openly (if you think you should send a message to communicate something, send it, but keep comments brief and relevant).
  • Be transparent.
    • Everything we do is "public by default."
    • We build in the open.
    • Declassify with care (easier to overlook confidential info when declassifying vs. when changing something that is already public from the get-go).
    • Open source is forever.
  • Better your collaboration.

Why this way?

Why do we use a wireframe-first approach?

Wireframing (or "drafting," as we often refer to it at Fleet) provides a clear overview of page layout, information architecture, user flow, and functionality. The wireframe-first approach extends beyond what users see on their screens. Wireframe-first is also excellent for drafting APIs, config settings, CLI options, and even business processes.

Here's why we use a wireframe-first approach at Fleet.

  • We create a wireframe for every change we make and favor small, iterative changes to deliver value quickly.
  • We can think through the functionality and user experience more deeply by wireframing before committing any code. As a result, our coding decisions are clearer, and our code is cleaner and easier to maintain.
  • Content hierarchy, messaging, error states, interactions, URLs, API parameters, and API response data are all considered during the wireframing process (often with several rounds of review). This initial quality assurance means engineers can focus on their code and confidently catch any potential edge cases or issues along the way.
  • Wireframing is accessible to people who understand our users but are not necessarily code-literate. So anyone can contribute a suggestion (at any level of fidelity). At the very least, you'll need a napkin and a pen, although we prefer to use Figma.
  • With Figma, thanks to its powerful component and auto-layout features, we can create high-fidelity wireframes - fast. We can iterate quickly with less work and less risk of the sunk-cost fallacy.

Why do we use one repo?

At Fleet, we keep everything in one repo. The only exception is when we're working on something confidential since GitHub does not allow confidential issues inside public repos. Here's why:

  • One repo is easier to manage. It has less surface area for keeping content up to date and reduces the risk of things getting lost and forgotten.
  • Our work is more visible and accessible to the community when all project pieces are available in one repo.
  • One repo pools GitHub stars and more accurately reflects Fleet's presence.
  • One repo means one set of automations and labels to manage, resulting in a consistent GitHub experience that is easier to keep organized.

Why organize work in team-based kanban boards?

It's helpful to have a consistent framework for how every team works, plans, and requests things from each other. Fleet's kanban boards are that framework, and they cover three goals:

  1. Intake: Give people from anywhere in the world the ability to request something from a particular team (i.e., add it to their backlog).
  2. Planning: Give the team's manager and other team members a way to plan the next three-week iteration of what the team is working on in a world (the board) where the team has ownership and feels confident making changes.
  3. Shared to-do list: What should I work on next? Who needs help? What important work is blocked? Is that bug fix merged yet? When will it be released? When will that new feature ship? What did I do yesterday?

Why a three-week cadence?

The Fleet product is released every three weeks. By syncing the whole company to this schedule, we can:

  • keep all team members (especially those who aren't directly involved with the core product) aware of the current version of Fleet and when the next release is shipping.
  • align project planning and milestones across all teams, which helps us schedule our content calendar and manage company-wide goals.
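As a rough illustration of the cadence, the sketch below lists upcoming release dates at three-week intervals. It is a hypothetical helper, not part of Fleet's tooling: the function name, the starting date, and the assumption that releases never slip are all made up for the example.

```python
from datetime import date, timedelta

# Length of one iteration, per the three-week cadence described above.
ITERATION = timedelta(weeks=3)


def upcoming_releases(last_release, count):
    """Return the next `count` release dates, each three weeks apart.

    Assumption: the schedule never slips. `last_release` is any past
    release date chosen by the caller.
    """
    return [last_release + ITERATION * n for n in range(1, count + 1)]


if __name__ == "__main__":
    # Illustrative starting date only; not an actual Fleet release date.
    for release_day in upcoming_releases(date(2022, 1, 5), 3):
        print(release_day.isoformat())
```

Teams outside the core product could use a list like this to line up content calendars and milestones with the release schedule.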

Why use agile methodology?

Releasing software iteratively gets changes and improvements into the hands of users faster and generally results in software that works. This makes contributors fitter, happier, and more productive. See the agile manifesto for more information.

Why the emphasis on training?

Investing in people and providing generous, prioritized training, especially up front, helps contributors understand what is going on at Fleet. By making training a prerequisite at Fleet, we can:

  • help team members make better decisions at work and feel confident in them.
  • create a culture of helping others, which results in team members feeling more comfortable even if they aren't familiar with the osquery, security, startup, or IT space.

Why not continuously generate REST API reference docs from javadoc-style code comments?

We prefer to generate our REST API reference docs the good old-fashioned way. By hand. Here are a few of the drawbacks that we have experienced when generating docs via tools like Swagger or OpenAPI, and some advantages of doing it by hand with Markdown.

  • Markdown gives us more control over how the docs are compiled, what annotations we can include, and how we present the information to the end-user.
  • Markdown is more accessible. Anyone can edit Fleet's docs directly from our website without needing coding experience.
  • A single Markdown file reduces the amount of surface area to manage that comes from spreading code comments across multiple files throughout the codebase (see "Why do we use one repo?").
  • Generated docs can become just as outdated as handmade docs, except since they are generated, they become siloed and more difficult to edit.
  • Autogenerated docs are typically hosted on a subdomain. This means we have less control over a user's journey through our website and lose the SEO benefits of self-hosted documentation.
  • Autogenerating docs from code is not always the best way to make sure reference docs accurately reflect the API. Based on our experience from past projects, we've learned that the benefits of generated docs do not outweigh the benefits of creating them by hand.
  • As the Fleet REST API, documentation, and tools mature, a more declarative format such as OpenAPI might become the source of truth, but only after investing in a format and processes to make it visible, accessible, and modifiable for all contributors.

Why handbook-first strategy?

The Fleet handbook provides team members with up-to-date information about how to do things in the company. By adopting the handbook-first strategy, we can encourage a culture of self-service and self-learning, which is essential for daily asynchronous work as part of an all-remote team.

This strategy was inspired by GitLab, which uses it to great effect. Check out this three-minute video about their take on the handbook-first approach.

Why direct responsibility?

We use the concept of directly responsible individuals (DRIs) to know who is responsible for what. Every group maintains its own dedicated handbook page, which is kept up to date with accurate, current information, including the group's kanban board, Slack channels, and recurring tasks ("rituals").

Why group Slack channels?

Groups are organized around goals. Connecting people with the same goals helps them produce better results by fostering freer communication. While groups sometimes align with the organization chart, some groups consist of people who do not report to the same manager. For example, product groups like #g-agent include engineers, not just the product manager.

Every group at Fleet maintains specific Slack channels, which all group members join and keep unmuted. Everyone else at Fleet is encouraged to mute these channels, using them only as needed. Each channel has a directly responsible individual (DRI) who keeps up with all new messages, even if they aren't explicitly mentioned (@).

Development groups

Fleet organizes development groups by their goals. These include members from Design, Engineering, and Product.

Goals:

progress (+) guarantee

  • Interface - more, successfully adopted features faster
    • (+) keep UI & API simple, minimalist, consistent, and bug-free
  • Platform - improve the productivity of the Interface team through patterns and infrastructure for implementing new features, reduce REST API latency, increase max load test size, make upgrading seamless for users, improve accuracy and reliability of data
    • (+) maintain quick time-til-merge timeframe for PRs reviewed, and maintain clean, empathetic interfaces that allow contributors in other groups to execute quickly and without the need to wait for review or approvals
  • Agent - grow the number of open source, osquery-based agents by making Fleet's agents better, faster, and broader in capabilities
    • (+) every table works intuitively with user-friendly docs and empathetic caveats, warnings, and error messages

At Fleet, groups define the relevant sections of the engineering org chart. A product manager (PM) represents each group and reports to the Product department (or a founder serving as an interim product manager):

  • Interface PM: Noah Talerman
  • Platform PM: Mo Zhu
  • Agent PM: Mo Zhu

Each group is associated with an engineering manager (EM), who, together with their engineers, forms the engineering side of the group.

Each group's PM works closely with engineers within their group:

  • The PM prioritizes work and defines what to iteratively build and release next within their group's domain to best serve the group's goals and the company's goals as a whole. The PM communicates why this work is prioritized and works with engineering to come up with the best possible how.
  • The PM is responsible for epics. Sometimes initiatives require multiple issues that may, or may not, include multiple development groups. These initiatives are tracked as GitHub issues with the "epic" label. One PM is assigned to the epic to make sure that all issues associated with the epic (child issues) make it into a release. It's the PM's responsibility to track progress and close the epic when the initiative is complete.
  • The EM (along with engineers) defines how to implement that definition within the surface area of code, processes, and rituals owned by their group while serving their group's goals and the company's goals as a whole.

Interface group

Responsibilities

  • Everything related to Fleet's graphical user interface (other than for the desktop application portion of Fleet Desktop)
  • fleetctl (the Fleet command-line interface) and the associated YAML documents (almost everything in fleetctl besides the fleetctl debug subcommands)
  • The REST API that serves these
  • The UX/developer experience, flow, steps, and associated UI and API interfaces for integrations that require any user interaction or configuration (e.g., GeoIP, Zendesk, Jira), including which third-party integrations are supported and which API styles and versions are chosen
  • End to end testing of the application (e.g., Cypress)
  • The REST API documentation
  • Future officially-supported wrapper SDKs, such as the Postman collection or, e.g., a Python SDK
  • Fleet's configuration surface, including
    • The config settings that exist for Fleet deployments and how they're configured
    • How feature flags are used
    • Their default values, supported data types, and error messages
    • Associated documentation on fleetdm.com
    • The UX of upgrading and downgrading and sidegrading Fleet tiers, and managing license keys

Consumers

  • A human using Fleet's graphical user interface
  • A human who is writing code that integrates Fleet's REST API
  • A human reading Fleet's REST API docs
  • A human using fleetctl, Fleet's Postman collection or Fleet's other future SDK wrappers

These humans might be working within the "Interface" group itself insofar as they consume the Fleet REST API. Or they might be a contributor to the Fleet community. Or one of Fleet's core users or customers, usually in an SRE, IT, or security role in an organization.

Goals

  • Bring value to Fleet users by delivering new features and iterations of existing features.
  • Increase adoption and stickiness of features.
  • Keep the graphical user interface, REST API, fleetctl, and SDKs like Postman reliable, minimal, consistent, and easy to use.
  • Promote stability of the API, introducing breaking changes only through the documented API versioning strategy or at major version releases.
  • Ensure observance of semantic versioning for the Fleet API and config between releases so that only major versions include breaking changes.
  • Delight users of Fleet's API, UI, SDKs, and documentation with a simple, secure, widely-adopted user and developer experience.
  • Improve Fleet's feature value and ease of use.
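The semantic versioning goal above can be expressed as a simple check. The following is a minimal sketch, assuming plain `MAJOR.MINOR.PATCH` version strings; the function names and version numbers are hypothetical and are not taken from Fleet's release tooling.

```python
def parse_version(version):
    """Parse 'MAJOR.MINOR.PATCH' (optionally prefixed with 'v') into ints."""
    major, minor, patch = version.lstrip("v").split(".")
    return int(major), int(minor), int(patch)


def may_include_breaking_changes(previous, new):
    """Return True only when the new release increments the major version.

    Under semantic versioning, minor and patch releases are expected to
    stay backward compatible.
    """
    return parse_version(new)[0] > parse_version(previous)[0]


if __name__ == "__main__":
    # Illustrative version numbers only.
    print(may_include_breaking_changes("4.17.0", "4.18.0"))
    print(may_include_breaking_changes("4.17.0", "5.0.0"))
```

A check like this could run in a release checklist to flag a change marked as breaking when the planned version is only a minor or patch bump.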

Platform group

Responsibilities

  • The implementation of the Fleet Agent API, i.e.,
    • The API used by agents on enrolled hosts to communicate with Fleet
    • The API used by agents for doing auto-updates and future installation/upgrade of custom extensions (e.g., TUF)
  • Everything related to providing a stable, simple-to-build-on platform for Fleet contributors to use when implementing changes to the REST API
    • APIs for storing, retrieving, and modifying device data
    • APIs for running asynchronous and scheduled tasks
    • APIs for communicating with external services (e.g., HTTP, SMTP)
  • The challenges of scale
    • Sometimes taking over development/improvement of features from the "Interface" group when these features have unexpected backend complexity or scaling challenges.
  • Production infrastructure, including
    • Fleet Cloud demo
    • Fleet Cloud prod
    • The registry (TUF) used for auto-updates and (in the future) extensions
    • Whatever backend is needed to generate installers in self-managed and hosted Fleet deployments
    • The future monitoring and 24/7/365 enhancement to the on-call rotation necessary for Fleet Cloud
  • Behind-the-scenes integrations
    • i.e., integrations that make Fleet "just work" and don't involve configuration from users
    • Example: the code that fetches and manages CVE data from NVD and other behind-the-scenes infrastructure that enables vulnerability management to exist in Fleet without requiring any interactions or configuration from users
  • The CI/CD pipeline
  • loadtest.fleetdm.com
  • dogfood.fleetdm.com

Consumers

  • A contributor from inside or outside the company.
  • A host enrolled in Fleet running an osquery-based agent.
  • The person who deploys, upgrades, and operates Fleet.
  • A person who uses Fleet and expects it to be fast, reliable, and joyful.

Goals

  • Reduce REST API latency
  • Increase max load test size
  • Reduce infrastructure costs for Fleet deployments.
  • Make upgrading Fleet versions seamless for users.
  • Improve the integrity of data (whether collected directly from agents or derived from that data, e.g., vulnerabilities).
  • Maintain quick time-til-merge timeframe for PRs reviewed.
  • Maintain clean, empathetic interfaces that allow contributors in other groups (or from outside the company) to execute quickly and without the need to wait for review or approval.
  • Make Fleet as easy as possible to operate and contribute to

Agent group

Responsibilities

  • osquery core, including the plugin interfaces and its config surface.
  • Orbit
  • Fleet Desktop
  • Any future agent-based software built by Fleet.
  • Fleet Agent API: the API interface and contributor docs used by osquery-enrolled agents for communicating with Fleet (how to implement the internals is up to the "Platform" group).
  • The extensions/tables bundled in Orbit/Fleet Desktop, such as mac admins.

Consumers

  • An engineer working on a custom solution (usually built in-house) on top of osquery
  • An SRE, IT operations, or DevOps professional using osquery-based agents in their default AMIs or container images, who deploys and manages osquery-based agents on their laptops, production servers/containers, and other corporate infrastructure
  • An end-user running an osquery-based agent (Fleet Desktop, orbit, or osquery) on their work laptop, who wants their laptop to be stable, performant, and as private as possible
  • An enterprise app owner (software engineer) running osquery-based agents on her app's servers
  • A contributor working on vanilla osquery in osquery/osquery
  • A contributor working on Fleet Desktop or orbit in fleetdm/fleet
  • Fleet itself, consuming the data generated by osquery and any other agent software

Goals

  • Grow mind/market share of open source, osquery-based agents by making Fleet's agents better, faster, and broader in capabilities
  • Every table works intuitively with user-friendly docs and empathetic caveats, warnings, and error messages
    • Every table is documented within the Fleet UI and in fleetdm.com/docs, with GOTCHAS, deprecation notices, and the version when the table was added
  • Make Fleet's agents easy to operate and contribute to

History

2014: Origins of osquery

In 2014, our CTO Zach Wasserman, together with Mike Arpaia and the rest of their team at Facebook, created an open source project called osquery.

2016: Origins of Fleet v1.0

A few years later, Zach, Mike Arpaia, and Jason Meller founded Kolide and created Fleet: an open source platform that made it easier and more productive to use osquery in an enterprise setting.

2019: The growing community

When Kolide's attention shifted away from Fleet, and towards their separate, user-focused SaaS offering, the Fleet community took over maintenance of the open source project. After his time at Kolide, Zach continued as lead maintainer of Fleet. He spent 2019 consulting and working with the growing open source community to support and extend the capabilities of the Fleet platform.

2020: Fleet was incorporated

Zach partnered with our CEO, Mike McNeil, to found a new, independent company: Fleet Device Management Inc. In November 2020, we announced the transition and kicked off the logistics of moving the GitHub repository.

Levels of confidentiality

  • Public (share with anyone, anywhere in the world)
  • Confidential (share only with team members who've signed an NDA, consulting agreement, or employment agreement)
  • Classified (share only with founders of Fleet, peepops, and/or the people involved; e.g., US social security numbers during hiring)

Email relays

There are several special email addresses that automatically relay messages to the appropriate people at Fleet. Each email address meets a minimum response time ("Min RT"), expressed in business hours/days, and has a dedicated, directly responsible individual (DRI) who is responsible for reading and replying to emails sent to that address. You can see a list of those email addresses in "Contacting Fleet" (private Google doc).

Tools we use

There are a number of tools that are used throughout Fleet. Some of these tools are used company-wide, while others are department-specific. You can see a list of those tools in "Tools we use" (private Google doc).

GitHub labels

We use special characters to define different types of GitHub labels. By combining labels, we organize and categorize GitHub issues. This reduces the total number of labels required while maintaining an expressive labeling system. For example, instead of a label called platform-dev-backend, we use #platform :dev ~backend.

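| Special character | Label type | Rules       | Examples                         |
|-------------------|------------|-------------|----------------------------------|
| #                 | Noun       | One only    | #platform, #interface, #agent    |
| :                 | Verb       | One or more | :dev, :research, :design         |
| ~                 | Adjective  | One or more | ~blocked, ~frontend, ~backend    |
| !                 | OKR        | One only    | !vuln, !desktop, !upgrade        |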
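The label rules in the table above can be checked mechanically. The sketch below is a hypothetical helper, not part of Fleet's tooling. It groups labels by their leading special character and flags combinations that use more than one label of a "One only" type. One assumption to note: the table does not say whether each label type is mandatory, so this sketch only enforces the upper limit and treats "One or more" as "no upper limit".

```python
# Hypothetical helper, not part of Fleet's tooling.
# Maps each special character to (label type, maximum allowed per issue).
# None means no upper limit ("One or more" in the table).
LABEL_TYPES = {
    "#": ("noun", 1),
    ":": ("verb", None),
    "~": ("adjective", None),
    "!": ("okr", 1),
}


def group_labels(labels):
    """Group labels by type based on their leading special character."""
    grouped = {name: [] for name, _ in LABEL_TYPES.values()}
    for label in labels:
        prefix = label[:1]
        if prefix not in LABEL_TYPES:
            raise ValueError(f"unknown label prefix in {label!r}")
        name, _ = LABEL_TYPES[prefix]
        grouped[name].append(label)
    return grouped


def check_labels(labels):
    """Return a list of rule violations (empty when none are found)."""
    grouped = group_labels(labels)
    problems = []
    for name, limit in LABEL_TYPES.values():
        if limit is not None and len(grouped[name]) > limit:
            problems.append(f"too many {name} labels: {grouped[name]}")
    return problems


if __name__ == "__main__":
    print(check_labels(["#platform", ":dev", "~backend"]))
    print(check_labels(["#platform", "#agent", ":dev"]))
```

Keeping the rules in a single mapping means that a new label type only needs one new entry rather than a change to the checking logic.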

Rituals

| Ritual | Frequency | Description | DRI |
|--------|-----------|-------------|-----|
| Weekly update reminder | Weekly | On Thursday, Charlie starts a thread in the #help-manage channel and asks managers to reply to the thread with a summary of what their team did in the past week. | Charlie Chance |
| Weekly update | Weekly | On Friday, Charlie updates the KPIs in the "🌈 Weekly updates" spreadsheet, combines the updates from managers into a single message and adds any hiring announcements. Charlie posts the company update in the #general channel. | Charlie Chance |

Slack channels

The following Slack channels are maintained by Fleet's founders and executive collaborators:

| Slack channel            | DRI                                   |
|--------------------------|---------------------------------------|
| #g-founders              | Mike McNeil                           |
| #help-mission-control    | Charlie Chance                        |
| #help-okrs               | Mike McNeil                           |
| #help-manage             | Mike McNeil                           |
| #news-fundraising        | Mike McNeil                           |
| #help-open-core-ventures | Mike McNeil                           |
| #general                 | N/A (announce something company-wide) |
| #thanks                  | N/A (say thank you)                   |
| #random                  | N/A (be random)                       |