The control system of many complex mechatronic products requires for each task the Worst Case Execution Time (WCET), which is needed for the scheduler's admission tests and subsequently limits a task's execution time during operation. If a task exceeds the WCET, this situation is detected and either a handler is invoked or control is transferred to a human operator. Such control systems usually support preemptive multitasking, and if an object-oriented programming language (e.g., Java, C++, Oberon) is used, then the system may also provide dynamic loading and unloading of software components (modules). Only modern, state-of-the art microprocessors can provide the necessary compute cycles, but this combination of features (preemption, dynamic un/loading of modules, advanced processors) creates unique challenges when estimating the WCET. Preemption makes it difficult to take the state of the caches and pipelines into account when determining the WCET, yet for modern processors, a WCET based on worst-case assumptions about caches and pipelines is too large to be useful, especially for big and complex real-time products. Since modules can be loaded and unloaded, each task must be analyzed in isolation, without explicit reference to other tasks that may execute concurrently. To obtain a realistic estimate of a task's execution time, we use static analysis of the source code combined with information about the task's runtime behavior. Runtime information is gathered by the performance monitor that is included in the processor's hardware implementation. Our predictor is able to compute a good estimation of the WCET even for complex tasks that contain a lot of dynamic cache usage, and its requirements are met by today's performance monitoring hardware. The paper includes data to evaluate the effectiveness of the proposed technique for a number of robotics control kernels that are written in an object-oriented programming language and execute on a PowerPC 604e-based system.