Urban research is moving towards fine-grained simulation and therefore requires more granular and accurate geospatial data. Compared with building footprints, roof structure lines (RSLs) are finer-grained elements of building roofs and provide a more detailed data reference. However, generating high-quality, up-to-date RSLs is arduous owing to the high cost of data sources (e.g., digital surface models and light detection and ranging data) and the low robustness of conventional image processing approaches. Although combining high-resolution satellite imagery with deep learning methods enables the automatic generation of RSLs, it also introduces two distinct challenges. First, the high diversity of roof sizes, forms, and spatial distributions complicates the extraction of essential RSL features from satellite imagery using general deep learning methods. Second, the significant class imbalance between foreground objects (i.e., RSLs) and background context in satellite imagery makes it difficult for deep learning methods to concentrate on RSL locations. To overcome these challenges and effectively delineate RSLs from satellite imagery, this study designs Deep Roof Refiner, an end-to-end and detail-oriented deep learning network, and proposes a synthetic strategy to enhance the network’s performance. The effectiveness of the proposed network is verified by quantitative and qualitative experiments, in which it achieves an optimal dataset scale F1-score of 60.89% and an optimal image scale F1-score of 63.48%. The proposed network significantly outperforms state-of-the-art deep learning methods and associated conventional research. The results indicate that the delineated RSLs can serve as a reliable data source for some urban building-based studies.
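The two reported scores differ only in how the binarization threshold is chosen: the optimal dataset scale score fixes one threshold for the whole dataset, whereas the optimal image scale score picks the best threshold for each image. The following is a minimal sketch of that distinction, not the evaluation code used in this study. It assumes strict pixel-wise matching with no distance tolerance and pools true positive, false positive, and false negative counts across images, as is common in edge-detection benchmarks; the function names (`f1_from_counts`, `count_matrix`, `ods_ois_f1`) are hypothetical.

```python
import numpy as np


def f1_from_counts(tp, fp, fn):
    """F1 from true positive, false positive, and false negative counts."""
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom > 0 else 0.0


def count_matrix(probs, masks, thresholds):
    """Return counts of shape (n_images, n_thresholds, 3) holding TP, FP, FN.

    Assumption: predictions and ground truth are compared pixel by pixel,
    without the distance tolerance that edge benchmarks often allow.
    """
    counts = np.zeros((len(probs), len(thresholds), 3), dtype=np.int64)
    for i, (prob, mask) in enumerate(zip(probs, masks)):
        truth = np.asarray(mask).astype(bool)
        for j, t in enumerate(thresholds):
            pred = np.asarray(prob) >= t
            counts[i, j] = (
                np.sum(pred & truth),    # line pixels correctly predicted
                np.sum(pred & ~truth),   # background predicted as line
                np.sum(~pred & truth),   # line pixels that were missed
            )
    return counts


def ods_ois_f1(probs, masks, thresholds):
    """Return (dataset-scale F1, image-scale F1) over a list of images."""
    counts = count_matrix(probs, masks, thresholds)
    n_images = len(probs)

    # Dataset scale: pool counts over images, then pick the single best threshold.
    pooled = counts.sum(axis=0)
    ods = max(f1_from_counts(*row) for row in pooled)

    # Image scale: pick the best threshold per image, then pool those counts.
    best = [
        max(range(len(thresholds)), key=lambda j: f1_from_counts(*counts[i, j]))
        for i in range(n_images)
    ]
    chosen = counts[np.arange(n_images), best].sum(axis=0)
    ois = f1_from_counts(*chosen)
    return ods, ois
```

Because RSL pixels are a small fraction of each image, F1 is computed only from line-pixel counts and ignores the abundant true negatives, which is why it is preferred over plain accuracy under the class imbalance described above.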