This study compares twenty-four panel tests, covering both the null hypothesis of a unit root and the null hypothesis of stationarity, on the basis of their size and power properties using Monte Carlo simulations. Eighteen tests with the null hypothesis of a panel unit root and six tests with the null hypothesis of stationarity are compared using the stringency criterion discussed by Zaman (1996), so that the comparison is made under a unified framework. For this comparison, the size of all (unit root and stationarity) tests is first stabilized around the nominal size of 5% by using simulated critical values instead of asymptotic critical values. The critical values are computed by Monte Carlo simulation for different numbers of cross-section units in the panel and different time series lengths. After equalizing the size of all tests, power comparisons are carried out for both categories of tests under two specifications of the deterministic part: with an intercept term only, and with both intercept and trend terms. A standard benchmark based on maximum shortcomings is constructed to classify tests as best, mediocre, or worst before comparing them for a fixed number of cross-section units with varying time series length and vice versa. Among the tests with the null hypothesis of a panel unit root, the De Wachter, Harris and Tzavalis (DWH); Im, Pesaran, and Shin (IPS); Levin, Lin, and Chu (LLC); and Westerlund (WT) tests are found to be best in small, medium, and large samples. Among the tests with the null hypothesis of stationarity, the Hadri (HD) and Hadri and Larsson (HL) tests perform best under both specifications of the deterministic part. An empirical evaluation of the best-performing panel unit root tests is carried out on the purchasing power parity hypothesis for sixteen OECD countries using a bootstrap method.
The results of the empirical study support the simulation results for the best-performing tests on the basis of empirical power.
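
The size-stabilization step summarized in this abstract, replacing asymptotic critical values with simulated ones and then comparing power at equal size, can be illustrated with a minimal Python sketch. The statistic used here is a pooled Dickey-Fuller t-statistic with unit-specific intercepts, chosen only as a simple stand-in; it is an assumption for illustration and is not claimed to be any of the tests compared in the study. The function names, the AR(1) data-generating process with standard normal errors, and the replication counts are likewise hypothetical.

```python
import numpy as np


def simulate_panel(n_units, n_periods, rho, rng):
    """Simulate y_it = rho * y_{i,t-1} + e_it with e_it ~ N(0, 1) (assumed DGP)."""
    e = rng.standard_normal((n_units, n_periods))
    y = np.zeros((n_units, n_periods))
    y[:, 0] = e[:, 0]
    for t in range(1, n_periods):
        y[:, t] = rho * y[:, t - 1] + e[:, t]
    return y


def pooled_df_tstat(y):
    """Pooled Dickey-Fuller t-statistic with unit-specific intercepts.

    y has shape (N, T). Illustrative stand-in statistic only.
    """
    dy = np.diff(y, axis=1)                      # first differences
    ylag = y[:, :-1]                             # lagged levels
    # Within transformation: remove each unit's mean to absorb its intercept.
    dy = dy - dy.mean(axis=1, keepdims=True)
    ylag = ylag - ylag.mean(axis=1, keepdims=True)
    sxx = np.sum(ylag ** 2)
    b = np.sum(ylag * dy) / sxx                  # pooled estimate of (rho - 1)
    resid = dy - b * ylag
    dof = dy.size - y.shape[0] - 1               # minus N intercepts and one slope
    s2 = np.sum(resid ** 2) / dof
    return b / np.sqrt(s2 / sxx)


def simulated_critical_value(n_units, n_periods, level=0.05, reps=2000, seed=0):
    """Left-tail critical value from the simulated null (rho = 1) distribution."""
    rng = np.random.default_rng(seed)
    stats = np.array([
        pooled_df_tstat(simulate_panel(n_units, n_periods, 1.0, rng))
        for _ in range(reps)
    ])
    return np.quantile(stats, level)


def rejection_rate(n_units, n_periods, rho, crit, reps=2000, seed=1):
    """Share of replications in which the unit root null is rejected (stat < crit)."""
    rng = np.random.default_rng(seed)
    stats = np.array([
        pooled_df_tstat(simulate_panel(n_units, n_periods, rho, rng))
        for _ in range(reps)
    ])
    return float(np.mean(stats < crit))


if __name__ == "__main__":
    N, T = 10, 25
    crit = simulated_critical_value(N, T)
    print("simulated 5% critical value:", crit)
    print("empirical size (rho = 1.0):", rejection_rate(N, T, 1.0, crit))
    print("size-adjusted power (rho = 0.8):", rejection_rate(N, T, 0.8, crit))
```

Because the critical value is recomputed for each combination of cross-section units and time series length, every test evaluated this way has approximately the same empirical size, so differences in rejection rates under the alternative can be read as differences in power.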