Str(3S)                   C STRING FUNCTIONS                   Str(3S)

NAME
     String and block-data package.

SYNOPSIS
     #include <str.h>

     cc -o ... str.o libV.a

     Str* AppStr(Str* dst, char* src, int len);
     Str* CatStr(Str* dst, Str* src1, Str* src2);
     int  ChkStr(Str* src);
     int  CmpStr(Str* st1, Str* st2);
     Str* CpyStr(Str* dst, Str* src);
     Str* DupStr(Str* dst, Str* src);
     Str* GetStr(Str* dst, int len);
     Str* MinStr(Str* dst, int len);
     Str* MakStr(Str* dst, char* src, int len);
     Str* ZapStr(Str* dst);

     Blk* AppBlk(Blk* dst, char* src, int len);
     Blk* CatBlk(Blk* dst, Blk* src1, Blk* src2);
     int  ChkBlk(Blk* src);
     int  CmpBlk(Blk* st1, Blk* st2);
     Blk* CpyBlk(Blk* dst, Blk* src);
     Blk* DupBlk(Blk* dst, Blk* src);
     Blk* GetBlk(Blk* dst, int len);
     Blk* MinBlk(Blk* dst, int len);
     Blk* MakBlk(Blk* dst, char* src, int len);
     Blk* ZapBlk(Blk* dst);

ALSO NEEDED
     The "str" package and the "V" (audit/debug/verbose) package call
     each other; you will need both of them to use either.  You need
     not include V.h or local.h yourself if you include str.h, but all
     three headers must be present for compiles to succeed.

DESCRIPTION
     This package defines dynamic string and block-data structures for
     C, and a set of routines for creating and manipulating them.  The
     string and block structures look like:

          #define Str struct _string_
          struct _string_ {
               char*  v;   /* The string itself */
               ulong  l;   /* Actual length of string */
               ulong  m;   /* Number of bytes malloc'd for the string */
               char*  p;   /* Some applications need an internal pointer */
               Flag   f;   /* Some applications need a set of flag bits */
          };

          #define byte unsigned char
          #define Blk struct _block_
          struct _block_ {
               byte*  v;   /* The block of data */
               ulong  l;   /* Actual length of block */
               ulong  m;   /* Number of bytes malloc'd for the block */
               char*  p;   /* Some applications need an internal pointer */
               Flag   f;   /* Some applications need a set of flag bits */
          };

     Note that the only difference between the Str and Blk structures
     is the type (char versus unsigned char) of the v field.
     The function calls listed above are actually macros that call the
     same routines for both structures.  Use whichever is most
     convenient.

     Note also that the p and f fields are optional, depending on how
     the package was compiled on a particular system or for a
     particular set of applications.  If the package is ever added to
     an official system library, we will have to decide whether or not
     to make these fields permanent.

     In the above synopses, the following conventions are used:

          dst  is a Str* that is filled in.  If null, it is allocated.
          src  is a Str* or char* that is used as input.  It may be
               null.
          len  is a character count, and must be nonnegative.

     Note that all the Str functions allow any of the pointer
     parameters to be null.  For input parameters, a null pointer is a
     null string.  For output parameters, a null pointer means to
     malloc space for the Str.  Failure can occur when malloc fails,
     i.e., when the program is out of memory; this produces an error
     message (if Vlvl > 1) and a null return value.  Otherwise, the
     return value is always the dst string.

     The exceptions are ChkStr(), which checks its parameter for
     validity and returns the number of problems that it finds, and
     CmpStr(), which returns an integer comparison result.  The rules
     that ChkStr() enforces are:

          If the v field is null, l and m must be zero.  This is one
          way to represent a null string.  Any Str with l == 0 is
          null.

          If the v field is nonnull, m may be zero or nonzero.  If m
          is zero, the v field was not obtained from malloc and cannot
          be freed; if the string must be expanded, a new v is
          allocated and the old v is copied to the new, but the old v
          is not freed.  If m is nonzero, it is the allocated size of
          v; enlarging the string means freeing the old value (after
          copying its contents to the new v, of course).

          If m is nonzero, l must be in the range [0,m].

          If the p field is nonnull, it must be within [v,v+l].

     Note that applications must NEVER assume that the v pointer is
     the same before and after one of these functions is called.
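     The validity rules that ChkStr() enforces, as listed above, can
     be sketched as a standalone function.  This is only an
     illustration, not the package's source: SketchStr and
     chk_sketch() are hypothetical names, the optional f field is
     omitted, and the way problems are counted (one per violated
     condition) is an assumption.

```c
#include <stddef.h>

/* Hypothetical stand-in for the package's Str (f field omitted). */
typedef struct {
    char*         v;  /* the string itself */
    unsigned long l;  /* actual length of string */
    unsigned long m;  /* bytes malloc'd for v, or 0 if v is not from malloc */
    char*         p;  /* optional internal pointer */
} SketchStr;

/* Count violations of the documented rules.
 * Assumption: a null SketchStr* is a valid null string (0 problems). */
int chk_sketch(const SketchStr* s)
{
    int problems = 0;

    if (s == NULL)
        return 0;

    /* Rule: if v is null, l and m must both be zero. */
    if (s->v == NULL) {
        if (s->l != 0) problems++;
        if (s->m != 0) problems++;
    }

    /* Rule: if m is nonzero, l must lie in [0,m] (l is unsigned). */
    if (s->m != 0 && s->l > s->m)
        problems++;

    /* Rule: if p is nonnull, it must lie in [v,v+l].
     * Assumption: a nonnull p with a null v counts as a problem. */
    if (s->p != NULL) {
        if (s->v == NULL || s->p < s->v || s->p > s->v + s->l)
            problems++;
    }

    return problems;
}
```

     A Str that points at a caller-owned buffer with m == 0 passes
     this check, which matches the rule that m == 0 only means "not
     obtained from malloc".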
     If a string must be expanded, it will be, and the v pointer will
     change.  This is also true of many routines that use this
     package.

WARNING
     An important programming problem in C is that most compilers
     don't allow giving initial values to a struct within a function.
     Thus when you enter a function, any local Str contains garbage in
     its fields.  If you pass it to any of the string-handling
     routines, the program will probably die with a memory fault from
     following the v pointer.  Unfortunately, there is in general no
     way that a C function can validate a pointer passed to it, and
     the string routines don't try.  You should use the InitStr(s)
     macro, which zeroes all the fields, for any such local Str
     variables, or assign to all the fields explicitly.  This is not
     normally a problem with global or static Str variables, though it
     is wise to give them an explicit {0} initial value.

FUNCTIONS
     The string-manipulating routines are as follows; change "Str" to
     "Blk" throughout to get the corresponding block routines.

     AppStr(Str* dst, char* src, int len)
          appends the len bytes at src to dst->v, calling malloc() as
          necessary to create dst and dst->v to hold the new, longer
          value.

     CatStr(Str* dst, Str* src1, Str* src2)
          catenates the values of the src1 and src2 strings, and puts
          the result into dst.  It is legal to call CatStr with dst
          the same as src1 or src2; the right thing is done in either
          case.

     ChkStr(Str* src)
          does a validity check on src, and returns a count of the
          problems found.  If Vlvl > 1, error messages will be
          produced on the Vout stream.  In some versions, this routine
          may produce warning messages that aren't counted; the error
          count includes only problems that are likely to cause later
          errors when the string is used.

     CmpStr(Str* st1, Str* st2)
          does a lexical comparison of st1 and st2 and returns -1, 0
          or +1 as st1 is before, equal to, or after st2.
          The current implementation of both CmpStr and CmpBlk uses
          8-bit, unsigned character values for the comparison, as no
          situations have been seen where signed-char comparisons are
          correct.

     CpyStr(Str* dst, Str* src)
          makes a copy of the src string in dst.  The new string will
          be expanded as necessary.  If src is null, then the effect
          is like ZapStr(dst).

     DupStr(Str* dst, Str* src)
          is a synonym for CpyStr(); it is a relic of the fact that
          this package was developed for several different jobs, and
          the packages were merged.

     GetStr(Str* dst, int len)
          makes dst into a null string whose v has room for at least
          len bytes.  The m field is checked, and if m < len, a new
          chunk of memory is malloc'd, and the old (if any) is freed.
          The l field is always set to zero.

     InitStr(Str dst)
          is a macro that stuffs zeroes into all the fields of the
          string.  It is used mostly, as noted above, to initialize
          local Str variables within a function, because most C
          compilers don't allow giving initializers for local structs.
          Note that the parameter in this case is NOT a pointer; it is
          the Str variable itself.

     MinStr(Str* dst, int len)
          makes sure that the dst string has space allocated for at
          least len bytes.  If dst->m < len, a new v will be
          allocated, and the old v (if any) will be copied to the new
          v.  Either arg may be null; if dst is null, a new Str will
          be allocated and returned; if len is zero, the value keeps
          its old allocated size.  (Actually, the current
          implementation makes sure that there is at least one extra
          byte, and fills it with a null, so that strings may be
          passed to printf() safely.)

     MakStr(Str* dst, char* src, int len)
          makes dst contain the len bytes at src as its value.  If
          dst->l or dst->m is already big enough, this is merely a
          copy.  If there isn't room, dst is expanded by calling
          malloc(len), possibly freeing the old value, and the data is
          copied from src to dst->v.

     ZapStr(Str* dst)
          makes dst into a null string.  If m > 0, the v will be
          freed.  Then all the fields are set to zero.
          This is much like free(), but the Str struct itself is not
          freed.  The return value is always dst; failure isn't
          possible.

DEBUGGING
     The debug version of this package also defines a parallel set of
     routines whose names end with 'M', and which take an extra
     (char*) parameter that is used in messages.  Thus if you call
     MinStrM(&st,n,"Init file") rather than MinStr(&st,n), and there
     isn't space for n bytes, you will get an error message including
     the string "Init file" and the value of n.  You can then just
     write the call as:

          if (!MinStrM(&st,n,"Init file")) Fail;

     and a meaningful error message will be generated for you.

     John Chambers' audit/debug/verbose package is also used by the
     string package, and V.h is #included by str.h, so you don't need
     to #include it yourself, though it doesn't hurt if you do.  This
     means that you must use -lV.a (or -laudit.a or -lverbose.a) when
     you use this string package.

SEE ALSO
     string(3)
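
EXAMPLE
     The expansion rule given under DESCRIPTION (a v with m == 0 is
     copied but never freed, a v with m > 0 is copied and then freed)
     and the behaviour described for ZapStr() can be sketched as
     below.  This is a minimal illustration under stated assumptions,
     not the package's code: SketchStr, min_sketch() and zap_sketch()
     are hypothetical names, the p and f fields are omitted, dst must
     be nonnull, and the sketch always moves a borrowed (m == 0) value
     into malloc'd storage.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the package's Str (p and f omitted). */
typedef struct {
    char*         v;  /* the string itself */
    unsigned long l;  /* actual length of string */
    unsigned long m;  /* bytes malloc'd for v, or 0 if v is not from malloc */
} SketchStr;

/* Ensure room for at least len bytes plus one terminating null.
 * Returns s, or NULL if malloc fails (s is then left unchanged). */
SketchStr* min_sketch(SketchStr* s, unsigned long len)
{
    /* Never allocate less than the current contents need. */
    unsigned long need = (len > s->l) ? len : s->l;
    char* nv;

    if (s->m > need)          /* already room for need bytes and the null */
        return s;

    nv = malloc(need + 1);
    if (nv == NULL)
        return NULL;

    if (s->l > 0)
        memcpy(nv, s->v, s->l);   /* copy the old value to the new v */
    nv[s->l] = '\0';              /* keep the value printf()-safe */

    if (s->m != 0)
        free(s->v);               /* only a malloc'd v may be freed */

    s->v = nv;                    /* the v pointer has now changed */
    s->m = need + 1;
    return s;
}

/* Make s a null string: free v only if it was malloc'd, zero the fields. */
SketchStr* zap_sketch(SketchStr* s)
{
    if (s->m != 0)
        free(s->v);
    s->v = NULL;
    s->l = 0;
    s->m = 0;
    return s;
}
```

     A caller that saved s->v before calling min_sketch() may be left
     holding a stale pointer afterwards, which is why the v pointer
     should always be re-read from the struct after any call that can
     expand the value.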