OpenSolaris
Collectives
Discussions
Documentation
Download
Source Browser
Free CD
Log-in
|
en
Community Group on
:
Developing Solaris
Top Menu
Show
:
Comments
Attachments
History
Information
Print
:
Print
Print preview
Export as PDF
Export as RTF
Export as HTML
Export as XAR
Wiki code for
Developing Solaris
Hide Line numbers
1: === So you want to develop Solaris... 2: 3: ---- 4: 5: So you’re new to Sun, or you’re new to Solaris. Maybe you have lots of experience developing mission-critical software, maybe you have none. But if you haven’t already figured it out, we take quality very seriously around here. Developing Solaris is very hard, and it’s very important. This is good news, not bad news ~-- solving easy problems is boring and solving irrelevant problems is, well, irrelevant. But you should be prepared that you will need to push yourself to deliver highest quality software. 6: 7: If you haven’t already discovered it, Solaris ~-- like any large software system ~-- has a complete range of software quality within its many subsystems. 8: 9: ==== Immaculate 10: 11: Some Solaris subsystems are beautiful works of engineering ~-- they are squeaky clean, well-designed and well-crafted. These subsystems are a joy to work in but (and here’s the catch) by virtue of being well-designed and well-implemented, they generally don’t need a whole lot of work. So you’ll get to use them, appreciate them, and be inspired by them ~-- but you probably won’t spend much time modifying them. (And because most of these subsystems have been implemented by engineers who are still with Sun, most of the changes will be done by the original implementor(s) anyway.) 12: 13: ==== Fetid 14: 15: Other Solaris subsystems are cobbled-together piles of junk ~-- reeking garbage barges that have been around longer than anyone remembers, floating from one release to the next. These subsystems have little-to-no comments (or what comments they have are clearly wrong), are poorly designed, needlessly complex, badly implemented and virtually undebuggable. There are often parts that work by accident, and unused or little-used parts that simply never worked at all. They manage to survive for one or more of the following reasons: 16: 17: * They work just well enough to not justify the cost of rewriting them 18: * The problem they solve isn’t important enough to justify the cost of rewriting them 19: * The problem they solve is so nasty that the cost of a rewrite is enormous ~-- or at least that it dwarfs the cost of ongoing maintenance 20: 21: If you find yourself having to do work in one of these subsystems, you must exercise //extreme// caution: you will need to write as many test cases as you can think of to beat the snot out of your modification, and you will need to perform extensive self-review. You can try asking around for assistance, but you’ll quickly discover that no one is around who understands the subsystem. Your code reviewers probably won’t be able to help much either ~-- maybe you’ll find one or two people that have had the same misfortune that you find yourself experiencing, but it’s more likely that you will have to explain most aspects of the subsystem to your reviewers. You may discover as you work in the subsystem that maintaining it is simply untenable ~-- and it may be time to consider rewriting the subsystem from scratch. (After all, most of the subsystems that are in the first category replaced subsystems that were in the second.) One should not come to this decision too quickly ~-- rewriting a subsystem from scratch is enormously difficult and time-consuming. Still, don’t rule it out //a priori//. 22: 23: Even if you decide not to rewrite such a subsystem, you should improve it while you’re there in manners that don’t introduce excessive risk. For example, if something took you a while to figure out, don’t hesitate to add a block comment to explain your discoveries. And if it was a pain in the ass to debug, you should add the debugging support that you found lacking. This will make it slightly easier on the next engineer ~-- and it will make it easier on you when you need to debug your own modifications. 24: 25: ==== Grimy 26: 27: Most Solaris subsystems, however, don’t actually fall neatly into either of these categories ~-- they are somewhere in the middle. That is, they have parts that are well thought-out, or design elements that are sound, but they are also littered with implicit intradependencies within the subsystem or implicit interdependencies with other subsystems. They may have debugging support, but perhaps it is incomplete or out of date. Perhaps the subsystem effectively met its original design goals, but it has been extended to solve a new problem in a way that has left it brittle or overly complex. Many of these subsystems have been fixed to the point that they work reliably ~-- but they are delicate and they must be modified with care. 28: 29: The majority of work that you will do on existing code will be to subsystems in this last category. You must be //very cautious// when making changes to these subsystems. Sometimes these subsystems have local experts, but many changes will go beyond their expertise. (After all, part of the problem with these subsystems is that they often weren’t designed to accommodate the kind of change you might want to make.) You //must// extensively test your change to the subsystem. Run your change on your desktop, your laptop, your home machine and every kind of machine you can grab a tip line to. But you can’t just be content with booting a machine with your change ~-- you must beat the hell out of it. Sometimes there is a stress test available that you may run, but this //**is not a substitute for writing your own tests**//. You should find any standards tests that might apply to the subsystem and run them. (If you don’t know which standards tests might apply to your change, consult the gatekeepers or the C-team.) You should review your own changes extensively. Are you obeying all of the locking rules? What //are// the locking rules, anyway? Are you building new dependencies into the subsystem? (This can only be answered with extensive, laborious cscope’ing ~-- you cannot rely on code reviewers to pick up subtle new dependencies.) Review your changes again. Then, print your changes out, take them to a place where you can concentrate, and review them yet again. And when you review your own code, review it not as someone who believes that the code is right, but as someone who is certain that the code is wrong. As you perform your self-review, look for novel angles from which to test your code. Then test and test and test. 30: 31: It can all be summed up by asking yourself one question: have you reviewed and tested your change every way that you know how? **You should not even //contemplate// a putback until your answer to this is an unequivocal YES.**. Remember: you are //always// empowered as an engineer to take more time to test your work. That’s how important the quality of your work is to the company, and that’s what makes Sun a great place to work: our engineers are always empowered to do the Right Thing. 32: 33: You should assume that once you putback your change, the rest of the world will be running your code //in production//. More specifically, if you happen to work in MPK17, within three weeks of putback, your change will be running on the building server that //everyone in MPK17// depends on. Should your change cause an outage during the middle of the day, some 750 people will be out of commission for the order of an hour. Conservatively, every such outage costs Sun $30,000 in lost time ~-- and depending on the exact nature of who needed their file system, calendar or mail and for what exactly, it could cost much, much more. 34: 35: If this costs us so much, why do we do it? In short, to avoid the [[Quality Death Spiral>>qual_death_spiral]]. The Quality Death Spiral is much more expensive than a handful of jurassic outages ~-- so it’s worth the risk. But you must do your part by delivering **FCS quality all the time**. 36: 37: Does this mean that you should contemplate ritual suicide if you introduce a serious bug? Of course not ~-- //everyone// who has made enough modifications to delicate, critical subsystems has introduced a change that has induced expensive downtime somewhere. We //know// that this will be so because writing system software is just so damned tricky and hard. Indeed, it is //because// of this truism that you **//must demand of yourself//** that you not integrate a change until you are out of ideas of how to test it. Because you will one day introduce a bug of such subtlety that it will seem that //no one// could have caught it. 38: 39: And what do you do when that awful, black day arrives? Here’s a quick coping manual from those of us who have been there: 40: 41: * Don’t pretend it didn’t happen ~-- you screwed up, but your mother still loves you (unless, of course, her home directory is on jurassic) 42: * Don’t minimize the problem, shrug it off or otherwise make light of it ~-- this is serious business, and your coworkers take it seriously 43: * If someone spent time debugging your bug, thank them 44: * If someone was inconvenienced by your bug, apologize to them 45: * Take responsibility for your bug ~-- don’t bother to blame other subsystems, the inherent complexity of Solaris, your code reviewers, your testers, PIT, etc. 46: * If it was caught internally, be thankful that a customer didn’t see it 47: 48: But most importantly, you must ask yourself: **what could I have done differently?** If you honestly don’t know, ask a fellow engineer to help you. We’ve all been there, and we want to make sure that you are able to learn from it. Once you have an answer, take solace in it; no matter how bad you feel for having introduced a problem, you can know that the experience has improved you as an engineer ~-- and that’s the most anyone can ask for.
Search
Collectives
Community Group
Academic and Research
Accessibility
Advocacy
Appliances
Approachability
Architecture Process and Tools
BrandZ
Chinese Users
Community Advisory Board
Databases
Desktop
Device Drivers
Distribution
Documentation
DTrace
Emerging Platforms
Fault Management
Games on OpenSolaris
HA Clusters
HPC Developer
Installation and Packaging
Internationalization and Localization
Laptop
Logical Domains
Modular Debugger (MDB)
Networking
NFS
Observability
OpenSolaris Governing Board (OGB)
OpenSolaris Printing
OS/Net (ON)
Performance
Power Management
PowerPC
Security
Service Management Facility (smf(5))
Software Porters
Solaris Volume Manager
Storage
Systems Administration Community Group
Testing
Tools Home
Unix File Systems (UFS)
Website Community
X Window System
Xen
ZFS
Zones
Project
ADSL Modem Enhancement
ARC Process Definition
ARM Platform Port
Automatic Data Migration
BIND Update
Bluetooth Stack & Drivers
Brocade FC HBA - Initiator
Brocade FC HBA - Target
Brussels - unified network link configuration
Caiman, Solaris Install Revisited
Celeste
Český portál
Chime Visualization Tool for DTrace
CIFS client for Solaris
CIFS Server
Clearview: Network Interface Coherence
Cluster Agent: Informix Dynamic Server
Cluster Agent: OpenSolaris Container
Cluster Agent: OpenSolaris xVM
Cluster Agent: Oracle E-Business Suite
Cluster agent: PostgreSQL
Cluster Agent: Samba
Cluster Agent: Tomcat
CMT
Coarse Data Flow Parallelism
Colorado: Open HA Cluster on OpenSolaris
Command Assistant
Common Array Manager
Companion - /opt/sfw: Free and Open Source software
COMSTAR: Common Multiprotocol SCSI Target
Content
Contest
CPU Observability
Credentials Process Groups
Crossbow: Network Virtualization and Resource Control
Crypto KMS Agent Toolkit
Cryptographic Framework
Data Migration Manager
Data Tethers
Deutsches Portal
Device Detection Tool
Device Driver Utility
Device Manager
Device Mapper
Direct Rendering Infrastructure & 3D drivers
DTrace Guide
Duckwater: Simplified name services management
Easy Tools
Emancipation
Emulex Fibre Channel Device Driver
Emulex Advanced Ethernet Device Driver
Enable/Enhance Solaris support for Intel Platform
Enhance the support of USB webcams
Enhanced SMF Profiles
Enhancements for AMD-based Platforms
Erlang DTrace Integration
Ethernet bridge module for Solaris
Evaluate Conary
Events Registry
Ext3 file system support
F/OSS Package Base
Facilitation
Fibre Channel over Ethernet
Fine Grained Access Policy (FGAP)
Fingerprint Authentication
Flexible Mandatory Access Control
Forensic Tools
Fully Open X Project
Fuse on Solaris
gcore
Generic Machine Check Architecture Improvements
Google SOC
HA-JBoss
HA-MySQL
Hadoop Live CD
Hitachi
HoneyComb Fixed Content Storage
HPC Stack
Image Packaging System
Improved Performance MIB
Indiana
Innovation Awards
Input Method
Intel Graphics
Interrupt Resource Management
IP Datapath Refactoring
IP over Infiniband
IPsec Tunnel Reform
iSCSI Extensions for Remote DMA (iSER)
iSNS Server
JeOS - Just enough Operating System
JKstat - a java binding for libkstat
Journaled File System (JFS)
K Desktop Environment
Kerberos
Kernel Sockets
Kernel SSL Enhancements
Key Management Framework
Korn Shell 93 integration/migration project
Labeled IPsec
LatencyTOP
Layer 2 Filtering
LDoms Manager
Lending
libMicro - portable microbenchmarks
Link Layer Discovery
Live Media: Technologies for distributions running from CD and other media
Locale Data
lofi compression and cryptography support
lx64 brand
Media Management System
Mega_sas
Mexico
MilaX minimal Live Distribution
MIPS Platform Port
Mozilla DTrace
MRSL.NONsharedDevice
Multi-lingual Glossary
Multi-pathing software (MPxIO)
Multiple disk sector size support
Multiple DOI
Muskoka: An open repository for OpenSolaris technical content
Navigator
Nemo: A Framework for High-Performance Networking
Network Auto-Magic
Network Data Management Protocol
Network MIBs
Network Storage
Network Time Protocol (NTP)
Nevada Globalization
New Design of 4over6 Mechanism Based on OpenSolaris
NFS RDMA transport update and performance analysis
NFS Server in non-Global Zones
NFS version 4.1 pNFS
NFSv4 namespace extensions
Nightingale: Port Songbird to OpenSolaris
NPort ID Virtualization (NPIV)
NUMA
Object Storage Device (OSD) support for Solaris
OHACGE Script Based Plug-in
ON/Nevada (ONNV) Project
Open Development Infrastructure
Open HA Cluster Utilities
Open Sound System
OpenGrok
OpenPegasus CIM Server
OpenRTI
OpenSolaris Busybox
OpenSolaris Desktop
OpenSolaris Hispano
OpenSolaris Security Audit
OpenSolaris support for the QEMU processor emulator: host and guest
PEF: Packet Event Framework
Performance Wrappers
Pkgfactory
Polski Portal
Portail Francophone
Portal Brasil
Portals
Power Management Usability Interfaces
Presto: Automatic Printing Configuration
Printable Many Page Solaris Manuals
Promise SuperTrak RAID HBA Driver
QLogic Converged Network Adapter GLDv3 NIC Driver
Quagga Routing Protocol Suite Integration
RAID Configuration Utility
RBridge (IETF TRILL) support
RDMA Offload Framework
Reno: Login Process Enhancements for Interop
Resource Management
s10brand
SAM/QFS
SCM Migration Project
SCSI RDMA Protocol
SDcard Drivers
Sensor Abstraction Layer
Session Initiation Protocol
SFW
Shell: bourne shell, korn shell, C shell, etc.
Sierra: Intel WiFi Chipsets Support
Simple Panels
SM-HBA Based SAS HBA Management
SMF Documentation
Solaris iSCSI Target
Solaris PowerPC Port
SourceJuicer
Sparks: name service switch/nscd enhancements
Squashfs
Star integration/migration project
Starfish
Starter Kit
Storage Power Management
Sun Security Toolkit
Sun StorageTek Availability Suite
Support for OpenFabrics User Verbs / API on OpenSolaris OS
Support gcc4/GCCfss in Solaris
Suspend/Resume
SVR4 Packaging
Systemz
Tamarack: Removable Media Enhancements in Solaris
Tesla: OpenSolaris Enhanced Power Management
Test Development
Tickless Kernel Architecture
TIPC
Trademarks
Trusted networking interface policy database for Trusted Extensions
Trusted Platform Module support
Use Case
Validated Execution Project
Virtual Console
Virtual Network Machines
Visual Panels
Visualization for HPC
Volo
VRRP: Virtual Router Redundancy Protocol Implementation
VSCAN service
Web Stack
Website
Winchester: Schema mapping and ID mapping for AD Interoperability
Wireless USB Support
Wireless Wide Area Network
X Consolidation
x86 Generic FMA Topology Enumerator
Xen Gate
Xfce: A lightweight desktop environment
ZFS Boot and Install
ZFS on disk encryption support
Zone Manager
Zone Statistics
Русский портал
البوابة العربية
भारतीय पोर्टल
中国门户
日本ポータル
한국 포탈
User Group
Adelaide
Argentina
Arizona
Atlanta
Baltimore-Washington
Bangalore
Bangkok
Bangladesh
Beijing
Bélem
Berlin
Bhimavaram
Bloomington
Campus Ambassadors
Capital Region
Cardiff
Charlotte
Chengdu
Chennai
Chihuahua
Chile
Cleveland
Colombia
Columbus
Connecticut
Cracow
Czech
Dallas/Ft. Worth
Danish
Delaware
Edinburgh
Egypt
Finland
Florida
Front Range
FuZhou
Great Lakes
Greece
Hangzhou
Hawaii
HeFei
Houston
Hyderabad
Indonesia
Irish
Israel
Italian
Jinan
Kabul
Kansas City
Latvia
London
Madurai
Manchester
Mato Grosso
Melbourne
Minas Gerais
Minnesota
Montreal
Moscow
Mumbai
Munich
NEA
Netherlands
New England
New York City
New Zealand
NIT Hamirpur
Noroeste
Oklahoma City
Osnabrück
Peru
Philadelphia
Piaski
Pittsburgh
Porto Alegre
Puget Sound
Pune
Queensland
Research Triangle Park
Romania
Russia
San Antonio
San Diego
San Francisco
São Paulo
Scottish
Serbia
Shanghai
Shenzhen
Silicon Valley
Singapore
Slovak
South African
Southern Connecticut
St. Louis
Sweden
Switzerland
Sydney
Szczecin
Taiwan
Tecum
Thames Valley
Tokyo
Toronto
Trondheim
Tulsa
Turkey
Ukraine
University of Melbourne
Vale do Paraíba
Vancouver
Venezuela
Welsh - Cymru
Wisconsin
Xi'an
Subsites
Code Reviews
Code Repositories
Package Search
Bugster
Bugzilla
Test Machines
Planet
Mailing Lists
Elections & Polls
ARC Case Logs
Source Juicer
Package Factory
User Authentication
Community Group on Pages
CRT
Advocates & Sponsors
Becoming a Sponsor
Becoming a CRT Advocate
Charter
RTI nits to avoid
Sponsor Tasks
Developing Solaris
Quality Death Spiral
Developer's Reference
Introduction
Prerequisites
The Source Tree, part 1
The Source Tree, part 2
Building OpenSolaris
Installing and Testing OpenSolaris
Integration Procedure
Best Practices and Requirements
Glossary
findunref and unreferenced files
ONNV Flag Days, Heads Ups, and Project Integration History
Builds 101-105
Builds 106-110
Builds 111-115
Builds 116-120
Builds 121-125
Builds 126-130
Builds 21-25
Builds 26-30
Builds 31-35
Builds 36-40
Builds 41-45
Builds 46-50
Builds 51-55
Builds 56-60
Builds 61-65
Builds 66-70
Builds 71-75
Builds 76-80
Builds 81-85
Builds 86-90
Builds 91-95
Builds 96-100
Installation (from source) Quickstart
Annotated nightly(1) Mail Example
Currently Known Issues
Putback Logs
Development Process
Schedule
Bourne/Korn Shell Coding Conventions
wx