THE ADVANCED COMPUTING SYSTEMS ASSOCIATION

The following paper was originally published in the Proceedings of the FREENIX Track: 1999 USENIX Annual Technical Conference

Monterey, California, USA, June 6–11, 1999

pk: A POSIX Threads Kernel

Frank W. Miller, Cornfed Systems, Inc.

© 1999 by The USENIX Association. All Rights Reserved.

Rights to individual papers remain with the author or the author's employer. Permission is granted for noncommercial reproduction of the work for educational or research purposes. This copyright notice must be included in the reproduced paper. USENIX acknowledges all trademarks herein.

For more information about the USENIX Association: Phone: 1 510 528 8649; FAX: 1 510 548 5738; WWW: http://www.usenix.org

www.cornfed.com

Introduction

pk is a new operating system kernel targeted for use in real-time and embedded applications. There are two novel aspects to the pk design:

• Documentation: The kernel is documented using literate programming techniques and the noweb [4] tool in particular.

• POSIX Threads with Memory Protection: The concurrency model is based on the POSIX Threads (aka Pthreads) [2, 3] standard; however, the kernel also provides page-based memory protection using Memory Management Unit (MMU) hardware.

This short paper discusses these facets of the pk kernel project. The use of literate programming is presented first, followed by a brief description of some of the pk design issues.

Literate Programming

Documentation is as important as the software it documents. This belief led me to contemplate how to document the pk kernel design as a primary goal. In my experience, the biggest problem with generating documentation is that it often seems to be a secondary activity, performed after the code is written. I became interested in the potential for the documentation discipline associated with literate programming techniques and decided to make use of these techniques with pk.

By discipline, I refer to a structure within which a project is performed that provides an incentive to generate documentation as the code is being written. Literate programming tools provide a mechanism that fosters such structure.

pk makes use of a tool called noweb. The basic concept is simple. Both documentation and code are contained in a single noweb file that uses several special formatting conventions. Two tools are provided. noweave extracts the documentation portion of the noweb file and generates a documentation file, in this case a LaTeX file. notangle extracts the source code portion of the noweb file and generates a source code file, in this case C source code.

The main consequence of using literate programming, and noweb in this case, is that changes to the system after initial development are performed on the noweb source files. Since the code is intermixed with its associated documentation, it is more likely that the documentation will be updated at the same time.

This is the first non-trivial project I have undertaken using literate programming, and I have seen an evolution in my use of the tools as I have progressed with it. As with many projects, it was begun by drawing on code from another project. In this case, I drew on some elements of the Roadrunner operating system [1]. These were basic elements, like initialization, interrupt processing, and memory management, that were needed to get a new kernel up and running quickly. These reused elements were not documented with noweb initially and some remain undocumented still, although my goal is to document the entire system using noweb over time.

The first new element to be written was the set of basic Pthread routines. I first wrote the code, and only after it was completed and tested to some degree did I go back and overlay the documentation and formatting to turn the C source code files into noweb files. This pattern repeated itself during the implementation of mutexes and condition variables. The noweb documentation was added only after the fact.

It happened that once I had completed the Pthreads routines, I decided to investigate the addition of protected memory to the kernel. Design issues associated with this decision are discussed in the next section. Continuing here, I want to discuss the implications for documentation that presented themselves. It was this decision that resulted in the first significant changes to existing source code that had been documented using noweb. Specifically, the memory management code, which maintains the heap of available physical pages, and various parts of the Pthreads code needed to be updated.

My first thought when I went to make changes to the first source code file was, "don't worry about the documentation, you can come back and fix that up later." I had no sooner opened my second source file when I realized I would forget what I had done if I didn't take care of the documentation. This would result in a document whose prose didn't match the code associated with it. I had to go back and change it. This was the discipline I had hoped would present itself. I went back and made the documentation changes.

At first this felt cumbersome; it added time to code maintenance. However, two unexpected effects began to emerge. First, I found that my design was cleaner. When I modified the code and changed the documentation, I thought about the problem twice. This led in several instances to a more concise change. Second, I found that I could make changes more quickly in code that I had not visited in a while. It may seem obvious, but the documentation was right there next to the code, and this allowed me to refamiliarize myself with it more quickly. I have now begun to implement pieces of code in the noweb source format as they are written for the first time. The power of the conciseness effect I discovered during maintenance is also present when writing code and documentation together during an initial implementation.

The granularity of the documentation varies over different parts of the code. There are several reasons for this. Foremost, different areas of the code have been documented at different times, and the documentation for a particular segment of code might not be performed all at one sitting. This results in sections of code that are "complete", i.e. they are documented in great enough detail to understand all aspects of their semantics. Other portions are coarser, perhaps only set up to fit into the overall structure of the piece of documentation, but not yet completed. There are also portions of code that do not require heavy documentation. They may be small routines, or the code itself may be so intuitive that only high-level prose is required to get across their function.

One interesting point about the use of literate programming seems to be that the licensing associated with the source code must also apply to the documentation, since the two are linked in the source files. pk is available under a BSD-style copyright, which places essentially no restrictions on redistribution. A similar project released under the GNU General Public License (GPL) would require the changes to the documentation to be redistributed in addition to changes to the source code.

There are a variety of literate programming tools available. I evaluated cweb, written by Donald Knuth, and noweb, written by Norman Ramsey. Knuth's cweb generates documentation of code fragments that are "pretty-printed", i.e. they have an algorithmic style reminiscent of textbooks on computer science theory. The noweb tools utilize a small set of simple formatting rules and generate code fragments that look cosmetically like they were extracted from a source code file. This style seemed more in tune with a systems programming project, like an operating system kernel, and so I decided to use noweb over cweb.

Pthreads and Memory Protection

pk is based on the POSIX Threads concurrency model. Pthreads were originally designed under the assumption that all of the threads would execute in the same address space. In fact, this address space was intended to be within a UNIX process. However, the Pthreads API is also used in real-time kernels that provide their applications a single, physical address space. pk is also targeted at real-time and embedded applications, but it augments the Pthreads design to include page-based memory protection using the MMU. Such a design falls somewhere in between the basic Pthreads model and the more substantial process concurrency model.

Since pk is targeted at time-critical applications, paging and/or swapping to secondary storage cannot be utilized. This is because of the significant lack of determinism introduced by moving memory pages back and forth to secondary storage.

If neither paging nor swapping is used, applications are limited to the amount of physical memory on a given machine. This fact raised the question of whether providing separate address spaces for each thread, in a manner akin to a process, was desirable. In my experience in embedded systems development, having direct access to specific physical addresses can be useful. For this reason, I decided to map virtual to physical addresses one-to-one.

The MMU is used simply to restrict access to memory locations, not to provide separate address spaces. Three types of memory protection are provided:

• Inter-thread: Threads may not access memory belonging to another thread.

• Kernel-thread: Threads may not access kernel memory except through well-defined system call entry points.

• Intra-thread: Code segments associated with a thread can be marked read-only.

Restricting access to parts of memory violates the assumption of a single unprotected address space present in the Pthreads API design. There are a variety of parameters in the API where pointers capable of referencing arbitrary memory locations are utilized. Allowing arbitrary values to be passed through these parameters invites the generation of copious page and general protection faults.

Several areas of the API have been scrutinized to address potential problems. The following list illustrates some of the attention required for the Pthreads API in pk. The list is not exhaustive, but gives a flavor of the kinds of issues in the API that cause difficulty in the pk design.

• Data Structures: Several data types have been further specified. Reference types for pthreads (pthread_t), mutexes (pthread_mutex_t), and condition variables (pthread_cond_t) cannot be typed as arbitrary pointers. In pk, they are defined as integer indices into kernel tables.

• pthread_create: The values for the start function pointer and arg argument pointer represent potentially arbitrary pointer accesses. In pk, semantic restrictions are placed on these pointer values. They must each point to the beginning of a valid region, and the ownership and mappings for each region will be transferred to the new thread if the creation succeeds.

• pthread_exit: The retval return value can be an arbitrary pointer value. The return value type is changed to int in pk.

• pthread_join: The retval parameter is used to collect the return value from an exiting thread. This parameter's type is changed to int in pk.

Several of these changes represent further specification of parts of the Pthreads standard that are not explicit. Changing the type of the return value represents a deviation from the standard. It is hoped that the impact of this change is minimal in code that might be ported to the pk system.

Conclusion

pk is available under BSD-style copyright terms. More information on the kernel is available on the Web at www.cornfed.com/pk. Downloads of source code and bootable floppy disk images are available at ftp.cornfed.com/pub. At the time of this short paper submission, late April 1999, there have been approximately 650 downloads of the pk source code distribution in the four months since its initial release on December 21, 1998. The interest in the system is quite gratifying and I look forward to continued and expanded development.

References

[1] Cornfed Systems, Inc., Roadrunner Operating System Reference, www.cornfed.com.

[2] IEEE, POSIX Std 1003.1c, www.ieee.org.

[3] Pthreads, www.mit.edu/people/proven/pthreads.

[4] Ramsey, N., Noweb - A Simple, Extensible Tool for Literate Programming, http://www.cs.virginia.edu/~nr/noweb.
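
The API restrictions described under Pthreads and Memory Protection can be pictured with a small user-space model in C. This is only a sketch under stated assumptions, not pk source code: every identifier (pk_thread_t, pk_boot, pk_create, pk_access_ok, and so on) is hypothetical, a region is simplified to a single page, and the MMU is replaced by a table lookup. It illustrates integer thread handles that index a kernel table, a one-to-one address mapping in which protection is decided per page, the check that the start and arg pointers each begin a region owned by the caller, and an int exit value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative model only: every name below is hypothetical, not taken from pk. */
#define PK_PAGE_SIZE   4096u
#define PK_NUM_PAGES   8u
#define PK_MAX_THREADS 4
#define PK_OWNER_KERNEL (-1)
#define PK_OWNER_FREE   (-2)

typedef int pk_thread_t;   /* handle = index into threads[], not a pointer */

struct pk_page   { int owner; int writable; };
struct pk_thread { int used; int done; int retval; };

static struct pk_page   pages[PK_NUM_PAGES];
static struct pk_thread threads[PK_MAX_THREADS];

/* Reset the model: page 0 is kernel memory, pages 1 and 2 belong to thread 0. */
void pk_boot(void)
{
    for (size_t i = 0; i < PK_NUM_PAGES; i++) {
        pages[i].owner = PK_OWNER_FREE;
        pages[i].writable = 1;
    }
    for (int t = 0; t < PK_MAX_THREADS; t++)
        threads[t] = (struct pk_thread){0, 0, 0};
    pages[0].owner = PK_OWNER_KERNEL;
    pages[1].owner = 0;
    pages[2].owner = 0;
    threads[0].used = 1;
}

/* A handle is valid only if it indexes a live entry in the thread table. */
static int valid_handle(pk_thread_t t)
{
    return t >= 0 && t < PK_MAX_THREADS && threads[t].used;
}

/* Identity mapping: the page number is simply addr / PK_PAGE_SIZE. */
int pk_access_ok(pk_thread_t t, uintptr_t addr, int write)
{
    size_t pg = addr / PK_PAGE_SIZE;
    if (!valid_handle(t) || pg >= PK_NUM_PAGES || pages[pg].owner != t)
        return 0;                         /* inter-thread and kernel-thread */
    return !write || pages[pg].writable;  /* intra-thread: read-only pages */
}

/* A pointer is accepted only if it is the first byte of a region (here, one
 * page) that the calling thread owns. */
static int region_start_ok(pk_thread_t t, uintptr_t addr)
{
    return addr % PK_PAGE_SIZE == 0 && pk_access_ok(t, addr, 0);
}

/* Returns the new thread's handle, or -1 if validation fails or no slot is free. */
pk_thread_t pk_create(pk_thread_t parent, uintptr_t start, uintptr_t arg)
{
    if (!region_start_ok(parent, start) || !region_start_ok(parent, arg))
        return -1;
    for (pk_thread_t t = 0; t < PK_MAX_THREADS; t++) {
        if (threads[t].used)
            continue;
        threads[t] = (struct pk_thread){1, 0, 0};
        pages[arg / PK_PAGE_SIZE].owner = t;       /* ownership moves to the child */
        pages[start / PK_PAGE_SIZE].owner = t;
        pages[start / PK_PAGE_SIZE].writable = 0;  /* code region becomes read-only */
        return t;
    }
    return -1;
}

/* The exit value is a plain int, so no pointer crosses a protection boundary. */
void pk_exit(pk_thread_t t, int retval)
{
    if (valid_handle(t)) {
        threads[t].done = 1;
        threads[t].retval = retval;
    }
}

/* Returns 0 and stores the exit value, or -1 if t is invalid or still running. */
int pk_join(pk_thread_t t, int *retval)
{
    if (!valid_handle(t) || !threads[t].done)
        return -1;
    *retval = threads[t].retval;
    threads[t].used = 0;
    return 0;
}
```

Under these assumptions, after pk_boot() a call such as pk_create(0, 1 * PK_PAGE_SIZE, 2 * PK_PAGE_SIZE) succeeds and moves both pages to the new thread, so thread 0 can no longer touch them, while a start address that is not page-aligned is rejected with -1. Reclaiming the pages of a joined thread is omitted for brevity.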