Driven by the high demand for spectrum, cognitive radio (CR) networks have emerged as a promising solution to spectrum scarcity through dynamic spectrum access. In this paper, we study a CR network architecture in which the CR base stations (CRBSs) demand spectrum resources that the CR users then access and utilize directly. We apply an economic Cournot game model to the system, with the CRBSs as the players. To optimize the outcome of the game, we propose a stochastic learning (SL) based scheme in which each CRBS adjusts the amount of resources it demands based on its action-reward history. Numerical results show convergence toward a Nash equilibrium (NE) point, and the system performs well in terms of total utility compared with other schemes.
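
To make the idea of adjusting demand from an action-reward history concrete, the following is a minimal sketch and not the scheme proposed in this paper. It assumes a textbook linear Cournot payoff with hypothetical parameters `a`, `b`, `c`, a small discrete grid of demand levels, and a linear reward-inaction learning automaton as a stand-in for the SL update. All function names are illustrative.

```python
import random


def cournot_utility(q_i, q_total, a=10.0, b=1.0, c=1.0):
    """Assumed linear Cournot payoff: unit price falls linearly with total demand."""
    price = max(a - b * q_total, 0.0)
    return (price - c) * q_i


def lri_update(probs, chosen, reward, step=0.05):
    """Linear reward-inaction update (an assumed SL rule); reward must lie in [0, 1].

    The chosen action gains probability in proportion to the reward and all
    other actions shrink by the same total, so the vector still sums to 1.
    """
    return [
        p + step * reward * (1.0 - p) if k == chosen else p - step * reward * p
        for k, p in enumerate(probs)
    ]


def simulate(actions, n_players=2, rounds=5000, step=0.05, seed=0):
    """Each player (CRBS) repeatedly samples a demand level and updates its mixed strategy."""
    rng = random.Random(seed)
    n_actions = len(actions)
    probs = [[1.0 / n_actions] * n_actions for _ in range(n_players)]
    # Normalizer: the best payoff a player could get if it were alone in the market.
    u_max = max(cournot_utility(q, q) for q in actions)
    if u_max <= 0.0:
        raise ValueError("no profitable action on this grid")
    for _ in range(rounds):
        picks = [rng.choices(range(n_actions), weights=p)[0] for p in probs]
        total = sum(actions[k] for k in picks)
        for i, k in enumerate(picks):
            utility = cournot_utility(actions[k], total)
            reward = min(max(utility / u_max, 0.0), 1.0)  # clip into [0, 1]
            probs[i] = lri_update(probs[i], k, reward, step)
    return probs


if __name__ == "__main__":
    for i, p in enumerate(simulate([1.0, 2.0, 3.0, 4.0, 5.0])):
        print(f"player {i}:", [round(x, 3) for x in p])
```

Under these assumed parameters, the continuous two-player Cournot equilibrium is q = (a - c)/(3b) = 3 per player, which lies on the grid. Whether and how quickly the learned strategies concentrate there depends on the step size and the number of rounds.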