On indie iOS and Mac development.

Displaying and Searching PDF Content on iPhone

Friday, January 20th, 2012

This is actually a repost… this was originally posted on my old blog, which is now defunct. Apparently, it got stuck in some search engine, or linked to from Stack Overflow or something… because I get regular requests (about 1-2 a month) asking that I put it back up. The code here isn’t really that good… it’s kinda a hack… and if you know a better way to do this, please speak up. Furthermore, this is the sum total of everything I know, or ever WANT to know, about PDFs… so if your question isn’t answered here, don’t bother emailing me… I have no idea. I’m pretty sure that reading the PDF spec in some way causes your soul to be condemned to an eternity of spanking by the CEO of Adobe… or something.


PDF parsing is a black art that most programmers avoid. “Madness lurks here,” they mumble quietly to themselves, choosing instead to push their PDFs through UIWebViews and commit other crimes against humanity.

It doesn’t have to be this way, however. Parsing, displaying, and searching PDFs natively and at a low level is actually surprisingly easy if you’re not afraid to get your hands a little dirty with the Core Graphics PDF functions. I’m going to show you how.

Where’s it hiding?

The first thing to know is that in order to do this, you need to use Core Graphics calls. So you need to include the Core Graphics framework in your project, and in any file where you want to use the calls, you have to import the CoreGraphics/CoreGraphics.h header. It’s probably also worthwhile to review the Core Foundation memory management rules.

Once you’ve done this, it’s very straightforward to read your PDF files and display them in a custom view. Let’s take a look at how we do that.

Initializing a PDF Document

To initialize a PDF document, you use the call CGPDFDocumentCreateWithURL, passing in the URL of the document you want to open as a CFURLRef. Since NSURL is toll-free bridged to CFURLRef, you can build the URL with plain old NSURL like so:

NSString *pathToPdfDoc = [[NSBundle mainBundle] 
                                    pathForResource:@"mypdf" ofType:@"pdf"];
NSURL *pdfUrl = [NSURL fileURLWithPath:pathToPdfDoc];

Then, to create the CGPDFDocumentRef, call CGPDFDocumentCreateWithURL:

// You own this reference (Create rule): call CGPDFDocumentRelease(document) when done.
CGPDFDocumentRef document = CGPDFDocumentCreateWithURL((CFURLRef)pdfUrl);

Displaying Pages

So now you have a document. To display the content of the document, you have to get the content in the form of pages. PDFs are already formatted by pages, so all you need to do is get at that data. Fortunately Core Graphics has functions for that too.

To get the total count of the pages in the document, you use the call CGPDFDocumentGetNumberOfPages, which takes the document you created above as its only parameter. So, for example:

size_t pageCount = CGPDFDocumentGetNumberOfPages(document);

Then, to get an individual page to display in your view, you use the function CGPDFDocumentGetPage, passing the document and the page number you want. Like so:

CGPDFPageRef page = CGPDFDocumentGetPage(document, currentPage);

Note that the currentPage parameter here is 1-based, not 0-based as is usual in programming. This means that the first page of the PDF document is, in fact, page 1 and not page 0. If you ask for a page that doesn’t exist, CGPDFDocumentGetPage returns NULL.
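If your UI hands you a zero-based index (a table row, say), the off-by-one is easy to get wrong. Here is a minimal sketch in plain C. Note that page_number_for_index is a hypothetical helper, not a Core Graphics call, and that returning 0 for "no such page" is an assumption of this sketch:

```c
#include <stddef.h>

/* Hypothetical helper (not part of Core Graphics): converts a zero-based
   index, such as a table row, into a 1-based PDF page number.
   Returns 0 when the index is out of range, since 0 is never a valid page. */
size_t page_number_for_index(size_t index, size_t page_count)
{
    if (index >= page_count)
        return 0;
    return index + 1;
}
```

With a three page document, index 0 maps to page 1, index 2 maps to page 3, and index 3 is rejected. You would pass page_count from CGPDFDocumentGetNumberOfPages and hand a nonzero result to CGPDFDocumentGetPage.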

Once you have the page, you can display it in your custom view. The only complicated part here is that on iPhone, the coordinate system is flipped compared to the Mac. This causes a problem because the Core Graphics PDF system uses the desktop coordinate system even on iPhone. (It’s yucky, I know.) The solution to this is to flip the page (this can be done in your drawRect method when you go to draw the content):

CGPDFPageRef page = CGPDFDocumentGetPage(document, currentPage);

CGContextRef ctx = UIGraphicsGetCurrentContext();

CGContextSaveGState(ctx);

CGContextTranslateCTM(ctx, 0.0, [self bounds].size.height);
CGContextScaleCTM(ctx, 1.0, -1.0);
CGContextConcatCTM(ctx, 
       CGPDFPageGetDrawingTransform(page, kCGPDFCropBox, [self bounds], 0, true));

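The key here is the pair of calls to CGContextTranslateCTM and CGContextScaleCTM. We get the current drawing context and scale its coordinate system along its y axis by -1.0, which flips it upside down about its horizontal (x) axis. On its own, that would put the page above the top of the view, so the translation by the view’s height moves it back into the visible bounds. CGPDFPageGetDrawingTransform then fits the page’s crop box into the view’s bounds.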
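If the flip feels like magic, it may help to see the arithmetic. The following is a rough sketch in plain C that mimics what the translate and scale calls do to a point. The Affine struct and every function here are stand-ins invented for illustration (they only imitate the idea of CGAffineTransform), so treat it as a model of the math rather than the real API:

```c
/* Illustrative stand-in for an affine transform; the field names follow
   the usual a, b, c, d, tx, ty convention. */
typedef struct {
    double a, b, c, d, tx, ty;
} Affine;

Affine affine_identity(void)
{
    Affine t = { 1.0, 0.0, 0.0, 1.0, 0.0, 0.0 };
    return t;
}

/* Moves the origin of the coordinate system described by t by (dx, dy). */
Affine affine_translate(Affine t, double dx, double dy)
{
    t.tx += t.a * dx + t.c * dy;
    t.ty += t.b * dx + t.d * dy;
    return t;
}

/* Scales the axes of the coordinate system described by t. */
Affine affine_scale(Affine t, double sx, double sy)
{
    t.a *= sx;
    t.b *= sx;
    t.c *= sy;
    t.d *= sy;
    return t;
}

/* Same order as the drawing code: translate by the height, then flip y. */
Affine flip_for_height(double height)
{
    return affine_scale(affine_translate(affine_identity(), 0.0, height),
                        1.0, -1.0);
}

/* Where a y coordinate ends up after the flip (x is left unchanged). */
double flipped_y(double height, double y)
{
    Affine t = flip_for_height(height);
    return t.d * y + t.ty;
}
```

The net effect is y' = height - y: in a 480 point tall view, y = 0 lands at 480 and y = 480 lands at 0, which is exactly the conversion between a bottom-left and a top-left origin.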

Finally, we draw the page into the context using the CGContextDrawPDFPage function:

CGContextDrawPDFPage(ctx, page);    
CGContextRestoreGState(ctx);

So basically, a full on drawRect method for a custom view that draws content from a PDF page, looks something like this:

-(void)drawRect:(CGRect)inRect
{
    if(document)
    {
        CGPDFPageRef page = CGPDFDocumentGetPage(document, currentPage);

        CGContextRef ctx = UIGraphicsGetCurrentContext();

        CGContextSaveGState(ctx);

        CGContextTranslateCTM(ctx, 0.0, [self bounds].size.height);
        CGContextScaleCTM(ctx, 1.0, -1.0);
        CGContextConcatCTM(ctx, 
                     CGPDFPageGetDrawingTransform(page, kCGPDFCropBox, 
                      [self bounds], 0, true));

        CGContextDrawPDFPage(ctx, page);    
        CGContextRestoreGState(ctx);
    }
}

That’s all there is to it!

Searching PDFs

One of the things that seems to be particularly scary to programmers is searching PDFs. I agree that it’s certainly not pleasant stuff to code, but it’s not hard either.

Now, I want to preface this by saying that I feel this code is a bit of a hack, but it definitely works, and seems to work quite well. Perhaps there’s a better way to do this, and if you know of one, please let me know. That said, however, here’s how I’ve done it.

The first thing to know is that a PDF page’s content is a stream of operands and operators. The operands come first and the operator follows them. Text is stored as glyph strings and shown by the “Tj” operator, which takes a single string, or the “TJ” operator, which takes an array of strings mixed with numeric spacing adjustments. Knowing this, you can access the page content as a stream and create a scanner that calls the callback functions you register whenever it encounters these operators. Inside a callback, you pop the operands off the scanner’s stack and use them to build your search corpus.

That probably sounds intimidating, but it’s not. You start out by creating a class that will be your “page searcher.” This will hold the state for your search engine. Here’s the listing for the interface for this class:
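To make the operand-then-operator idea concrete, here is a toy model in plain C. Everything in it (Corpus, OperatorEntry, show_text, scan_tokens, demo_extract, and the pre-split token list) is made up for illustration. The real work is done by CGPDFScannerRef and CGPDFOperatorTableRef, and real content streams are nowhere near this tidy:

```c
#include <stddef.h>
#include <string.h>

#define CORPUS_MAX 256

/* Toy stand-in for the object that collects the page text. */
typedef struct {
    char text[CORPUS_MAX];
} Corpus;

/* Callbacks receive the pending operand plus an untyped context pointer,
   in the same spirit as the userInfo pointer used by the real scanner. */
typedef void (*OperatorCallback)(const char *operand, void *userInfo);

typedef struct {
    const char *name;
    OperatorCallback callback;
} OperatorEntry;

/* Callback for "Tj": strips the surrounding parentheses from the string
   operand and appends its contents to the corpus. */
void show_text(const char *operand, void *userInfo)
{
    Corpus *corpus = (Corpus *)userInfo;
    size_t len = strlen(operand);
    size_t used = strlen(corpus->text);
    if (len < 2 || used + (len - 2) >= CORPUS_MAX)
        return;
    memcpy(corpus->text + used, operand + 1, len - 2);
    corpus->text[used + len - 2] = '\0';
}

/* Tokens starting with '(' are string operands; anything else is treated
   as an operator that consumes the pending operand. */
void scan_tokens(const char *const *tokens, size_t count,
                 const OperatorEntry *table, size_t table_len, void *userInfo)
{
    const char *operand = NULL; /* one-slot "stack", enough for this demo */
    for (size_t i = 0; i < count; i++) {
        if (tokens[i][0] == '(') {
            operand = tokens[i];
            continue;
        }
        for (size_t j = 0; j < table_len; j++) {
            if (operand != NULL && strcmp(tokens[i], table[j].name) == 0)
                table[j].callback(operand, userInfo);
        }
        operand = NULL; /* operators without a callback just drop it */
    }
}

/* Runs the toy scanner over a tiny hand-written token list. */
const char *demo_extract(void)
{
    static Corpus corpus;
    static const char *const tokens[] = {
        "BT", "(Hello)", "Tj", "( World)", "Tj", "ET"
    };
    static const OperatorEntry table[] = { { "Tj", show_text } };
    corpus.text[0] = '\0';
    scan_tokens(tokens, sizeof tokens / sizeof tokens[0],
                table, sizeof table / sizeof table[0], &corpus);
    return corpus.text;
}
```

Only the operators registered in the table trigger a callback. Here BT and ET are ignored, and the two Tj operators each append the string that preceded them.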

#import <Foundation/Foundation.h>
#import <CoreGraphics/CoreGraphics.h>

@interface PDFSearcher : NSObject 
{
    CGPDFOperatorTableRef table;
    NSMutableString *currentData;
}
@property (nonatomic, retain) NSMutableString * currentData;
-(id)init;
-(BOOL)page:(CGPDFPageRef)inPage containsString:(NSString *)inSearchString;
@end

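Pretty straightforward stuff. We use the currentData member to store the text of the page being scanned. This is a member variable rather than a local variable because we’re going to be using C functions to fill it in. Don’t worry, that’ll make sense in a moment.

The init method for the class actually creates the callback table. Because the callbacks are plain C functions, they need to be declared above this method in the file. The table should also be released with CGPDFOperatorTableRelease in dealloc: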

-(id)init
{
    if(self = [super init])
    {
        table = CGPDFOperatorTableCreate();
        CGPDFOperatorTableSetCallback(table, "TJ", arrayCallback);
        CGPDFOperatorTableSetCallback(table, "Tj", stringCallback);
    }
    return self;
}

The arrayCallback and the stringCallback functions are C functions that will be called by the scanner. They’re shown here:

void arrayCallback(CGPDFScannerRef inScanner, void *userInfo)
{
    PDFSearcher * searcher = (PDFSearcher *)userInfo;

    CGPDFArrayRef array;

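    // Bail out if the operand on top of the stack is not an array.
    if(!CGPDFScannerPopArray(inScanner, &array))
        return;

    // A TJ array can mix strings and numeric spacing adjustments in any
    // order, so check every element instead of every other one.
    size_t count = CGPDFArrayGetCount(array);
    for(size_t n = 0; n < count; n++)
    {
        CGPDFStringRef string;
        // This fails for the numeric elements, which are simply skipped.
        if(CGPDFArrayGetString(array, n, &string))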
        {
            NSString *data = (NSString *)CGPDFStringCopyTextString(string);
            [searcher.currentData appendFormat:@"%@", data];
            [data release];
        }
    }
}

void stringCallback(CGPDFScannerRef inScanner, void *userInfo)
{
    PDFSearcher *searcher = (PDFSearcher *)userInfo;

    CGPDFStringRef string;

    bool success = CGPDFScannerPopString(inScanner, &string);

    if(success)
    {
        NSString *data = (NSString *)CGPDFStringCopyTextString(string);
        [searcher.currentData appendFormat:@" %@", data];
        [data release];
    }
}

As you can see, these are called when the scanner encounters the operators. When that happens, we pop the data off the scanner and add it to the searcher’s corpus. The userInfo pointer actually points to our searcher object (because we pass it as the third parameter to CGPDFScannerCreate in the next listing). So we can typecast it to a PDFSearcher and then access that currentData member (remember I said it would make sense later?).

The actual search method looks like this:

-(BOOL)page:(CGPDFPageRef)inPage containsString:(NSString *)inSearchString
{
    // Start with an empty corpus for this page.
    [self setCurrentData:[NSMutableString string]];
    CGPDFContentStreamRef contentStream = CGPDFContentStreamCreateWithPage(inPage);
    // self is handed to the callbacks as their userInfo pointer.
    CGPDFScannerRef scanner = CGPDFScannerCreate(contentStream, table, self);
    // The callbacks fill currentData while the scan runs.
    CGPDFScannerScan(scanner);
    CGPDFScannerRelease(scanner);
    CGPDFContentStreamRelease(contentStream);
    // Uppercase both sides for a case-insensitive comparison.
    return ([[currentData uppercaseString] 
          rangeOfString:[inSearchString uppercaseString]].location != NSNotFound);
}

Basically, we create a stream from the page data, then use that and our callback table to create a scanner. We then scan the data. It’s at this point our currentData member is being filled with the data from the PDF as strings. Finally, we just search that string for our search string.
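The last step, a case-insensitive substring check repeated for each page, can be sketched without any PDF machinery at all. This is a plain C illustration under loud assumptions: the page texts are already extracted into C strings, the comparison is ASCII-only (NSString handles far more than that), and contains_ignoring_case and first_matching_page are hypothetical names:

```c
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if needle occurs in haystack, ignoring ASCII case.
   An empty needle is treated as "not found" in this sketch. */
int contains_ignoring_case(const char *haystack, const char *needle)
{
    if (*needle == '\0')
        return 0;
    for (; *haystack != '\0'; haystack++) {
        const char *h = haystack;
        const char *n = needle;
        while (*h != '\0' && *n != '\0' &&
               toupper((unsigned char)*h) == toupper((unsigned char)*n)) {
            h++;
            n++;
        }
        if (*n == '\0')
            return 1; /* consumed the whole needle, so it matched here */
    }
    return 0;
}

/* Returns the 1-based number of the first page whose text contains
   needle, or 0 if no page matches. */
size_t first_matching_page(const char *const *page_texts, size_t page_count,
                           const char *needle)
{
    for (size_t i = 0; i < page_count; i++) {
        if (contains_ignoring_case(page_texts[i], needle))
            return i + 1;
    }
    return 0;
}
```

In the real thing, the loop would run from page 1 to the page count, fetch each page with CGPDFDocumentGetPage, and ask the searcher whether that page contains the string.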

Easy peasy.

Note: much of this code is only sight compiled. I pulled it from some code I had, but it wasn’t a straight across copy, so if you find an error, please let me know.

On Preprocessor Macros, ARC and Open Source

Thursday, January 19th, 2012

I get a few pull requests here and there from folks who want me to convert my blocks code to ARC. Rest assured that I will, eventually, but for now, I’m keeping it ARC-agnostic so that people who have not converted to ARC can continue to use it. It’s easier to enable or disable ARC on open source code you download than it is to force people who have not converted to ARC to put the releases and retains back into that same code. Believe me, it’s as annoying for me as it is for you that said code is not ARC. All of my projects are now ARC, and I wouldn’t code any other way. However, I, too, use -fno-objc-arc on my own open source code when needed.

That said, some have suggested some alternative ways of handling open source and ARC. Most of them consist of using preprocessor macros to determine if ARC is enabled and to execute an alternative code path depending on that state. For example:

-(void)doSomething
{
    Foo *foo = [[Foo alloc] init];
    ...
#if !__has_feature(objc_arc)
    [foo release];
#endif
}

I believe that this is a fundamentally wrong way to handle this situation, and I’d like to explain why.

Macros are Ugly

Macros are a very valuable tool. In fact, their purpose is to do exactly what you, young padawan, are intending to do: look at what type of environment your code is being compiled for, and give you some level of control over what code gets compiled in that environment. However, they are also a very blunt and error-prone tool. They have no type checking, and code that is cluttered with excessive macro usage is extremely hard to follow. Because of the way macros work, even compiler errors generated from macro-generated code are often misleading, or downright impossible to read. Using macros all over your code like this to execute alternative code paths is fraught with peril.

In some cases, rather than sprinkling these #ifdefs all over the place in their code, some folks might simply think that they should #define their own retain/release replacements that do the switching between ARC and non-ARC for them. I just can’t bring myself to think that this is a better solution. In fact, I think it’s worse. Sprinkling MY_RELEASE(foo); all over is disconcerting to read, and will result in wasted time for other coders who not only can, but should check to see what MY_RELEASE actually does.

In addition to these, the de facto standard of naming preprocessor macros in all caps results in a codebase that more closely resembles a CAPSLOCKed email from your crazy computer-illiterate uncle than the delicate prose for which the admittedly verbose Objective-C language is known.

This is not to say that preprocessor macros are totally to be avoided by everyone. If you’re building an extensive framework, or if you are a systems level programmer, then you definitely have very justifiable cause for using preprocessor macros. This is particularly true in cases where you want to maintain compatibility across multiple platforms, or in cases where you need to help inform programmers using your code about how it needs to be used (e.g., by having macros that show warnings for deprecations). There is probably also a small minority of programmers who might need to use macros for particularly performance critical code. If you’re one of these types of programmers, then you have my condolences.

My open source code does not fall into any of these categories. It’s simply one or two files that you drop into your project… and away you go. There is simply no need for using macros.

How should ARC be handled in this case?

Simple. There’s an excellent tool that the compiler provides for you. It’s the -fobjc-arc and -fno-objc-arc flags. You can set them on any file in your project just by going to the “Build Phases” tab of your target in Xcode, opening “Compile Sources”, and setting those as compiler flags on the files in question.

Admittedly, if your framework is very large… this could be an onerous task. In my case, however, these open source components are very small. One to two files at most.

If a user includes a non-ARC file in an ARC project, they’ll get an error when they compile it because explicit calls to retain, release and autorelease are not allowed under ARC. So the developer will be informed and be able to take the appropriate action. However, if you include an ARC’d file in a non-ARC project, it’ll just silently compile and leak.

To solve this, you should use a macro, but not to have conditional code paths… you should use a macro to inform the developer that they need to use ARC for that file. Here’s a simple one that I use:

#if ! __has_feature(objc_arc)
#error This file must be compiled with ARC.
#endif

This will prevent the developer from using this file unless they set the correct compiler flags. It’s simple. It’s clean. It’s out of the way, and it doesn’t clutter the code. Put this at the top of your implementation files when you convert them to ARC.