Predicting protein-ligand binding affinities is an important problem in molecular recognition and virtual screening. We have developed a scoring function, GemAffinity, that predicts binding affinities. It was derived by analyzing 88 descriptors computed from 891 protein-ligand structures selected from the Protein Data Bank (PDB). From these 88 descriptors, a stepwise regression method identified the five used in GemAffinity: van der Waals contacts, metal-ligand interactions, water effects, ligand deformation penalties, and hydrogen bonds between the bound ligand and highly conserved residues. GemAffinity was evaluated on an independent set, where the correlation between predicted and experimental values was 0.572, the best among 13 methods on this set. GemAffinity was then applied to virtual screening for thymidine kinase (TK), human carbonic anhydrase II (HCAII), and the estrogen receptor for antagonists (ER) and agonists (ERA). The results indicate that GemAffinity reduces known disadvantages of energy-based scoring functions, namely their preference for highly polar or high-molecular-weight compounds. In addition, GemAffinity can be easily combined with other scoring functions to improve screening accuracy. We believe that GemAffinity is useful for binding-affinity prediction and virtual screening.
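The descriptor-selection step can be illustrated with a minimal sketch. It assumes a forward-only variant of stepwise regression that greedily adds the descriptor giving the largest drop in residual sum of squares. The exact stepwise procedure and stopping criterion used for GemAffinity are not given here, so this is an illustration, not the authors' implementation. The data are synthetic stand-ins: only the shapes (891 structures, 88 descriptors, five selected) follow the text. The test-set size, the column indices, the weights, and every function name are hypothetical.

```python
import numpy as np


def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns [intercept, w1, w2, ...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(X, coef):
    """Apply a fitted linear model to a descriptor matrix."""
    return coef[0] + X @ coef[1:]


def forward_stepwise(X, y, n_select):
    """Greedy forward selection: at each step, add the column that most reduces RSS."""
    selected = []
    remaining = list(range(X.shape[1]))
    for _ in range(n_select):
        best_col, best_rss = None, np.inf
        for col in remaining:
            cols = selected + [col]
            coef = fit_ols(X[:, cols], y)
            rss = float(np.sum((y - predict(X[:, cols], coef)) ** 2))
            if rss < best_rss:
                best_col, best_rss = col, rss
        selected.append(best_col)
        remaining.remove(best_col)
    return selected


def pearson_r(a, b):
    """Pearson correlation between predicted and reference values."""
    return float(np.corrcoef(a, b)[0, 1])


# Synthetic stand-in data (NOT the PDB-derived descriptors from the text).
rng = np.random.default_rng(0)
n_train, n_test, n_desc = 891, 100, 88      # n_test is an arbitrary assumption
true_cols = [3, 17, 42, 56, 71]             # hypothetical informative descriptors
weights = np.array([1.5, -2.0, 1.0, 2.5, -1.2])

X_train = rng.normal(size=(n_train, n_desc))
X_test = rng.normal(size=(n_test, n_desc))
y_train = X_train[:, true_cols] @ weights + rng.normal(scale=0.5, size=n_train)
y_test = X_test[:, true_cols] @ weights + rng.normal(scale=0.5, size=n_test)

# Select five descriptors on the training set, then score an independent set.
selected = forward_stepwise(X_train, y_train, n_select=5)
coef = fit_ols(X_train[:, selected], y_train)
r_test = pearson_r(predict(X_test[:, selected], coef), y_test)
print(sorted(selected), round(r_test, 3))
```

Selecting on one set and computing the correlation on a separate set mirrors the evaluation described above. Because the synthetic signal is deliberately clean, the correlation printed here says nothing about the reported value of 0.572.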